Re: [MLS] New Parent Hash

Richard Barnes <rlb@ipv.sx> Tue, 12 January 2021 15:52 UTC

Return-Path: <rlb@ipv.sx>
X-Original-To: mls@ietfa.amsl.com
Delivered-To: mls@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 2876C3A09E8 for <mls@ietfa.amsl.com>; Tue, 12 Jan 2021 07:52:19 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.896
X-Spam-Level:
X-Spam-Status: No, score=-1.896 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, HTML_MESSAGE=0.001, SPF_HELO_NONE=0.001, SPF_NONE=0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=ipv-sx.20150623.gappssmtp.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id UJDojTGG7wPz for <mls@ietfa.amsl.com>; Tue, 12 Jan 2021 07:52:12 -0800 (PST)
Received: from mail-qk1-x733.google.com (mail-qk1-x733.google.com [IPv6:2607:f8b0:4864:20::733]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 568383A09E1 for <mls@ietf.org>; Tue, 12 Jan 2021 07:52:12 -0800 (PST)
Received: by mail-qk1-x733.google.com with SMTP id c7so2232713qke.1 for <mls@ietf.org>; Tue, 12 Jan 2021 07:52:12 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ipv-sx.20150623.gappssmtp.com; s=20150623; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=ffOmpD7A51p4LX7/WtjhJIQK1IOzO/EwuKJlRXPNbb0=; b=pEXodi8jlW7oF7cyxYvN8ALT+vB2WnTlvKcMOSkZyRCcXWnx0sKYyE32Xg7htPfPKR HAGgU52vJt512f65i/csPRC1P7H5UUlaI4K5/ITFIv7Td9HTzOTKPkmuWtX5aCKuyRpu h9lRZRfXmlX3TSCTY3od8+JFT+sctditPJk/jPc3LvO0Gsi+KZqtwBh73bh94/fpLw/J sFItnvTWCZx9Zp08F4gFPVPK5Bx9JMSDX5+CwxGwY4hyOnkBM57myS64aQEgnFHhgr2k c5fJC65P5aC97m/e6F1NbPnlltjyY2hIoXswFGqI4sxQGHn9YPksSvFAPOTwCrualL8i oIBA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=ffOmpD7A51p4LX7/WtjhJIQK1IOzO/EwuKJlRXPNbb0=; b=t4lp/gh+ZztRwn/cJfioYnXGCE33qs4X/rvEue+TKIqfnbWaz7B1L1YOkmOicIOgjY yCbSzS4Yzoxv1zpP5UPpl4L9U3rtAd1yRxIqbkm2FCfOutKNL78OeRQHQ4VrPqtHlJWK KQW+YnJykrexhmv0lGuvzE+1RNX+p17pXu0/BUbHTTOQIBKuvKkKeqJvTtc1L2NLEU9Z g47BUZQa5pYr0YJsZEX7k+4W99cSmcLowUB4UY/3DEjvej+NrGwUVc/pBhG0FSAmUNSM utUXi67b68CogaVllBdius0F57PN7VYVGl5ekDRoMlwsrVU0uFXRTKeEY3UaCWL4eTbn zKRQ==
X-Gm-Message-State: AOAM530vW0HrOxjr9VfO3PHKbgOxPuj7tyk4FRsDYC1SwZvI5aNyDR7+ JZQoJmjhaHcW32jbpsTAMQf4EwHgCCGxWyoFTND2mnurh6Y=
X-Google-Smtp-Source: ABdhPJxnZIYqoGqbHN87j4iG0x3ztsw0gBCKy2JTTsyNk+lQLe3syv3r1ZzH8qh8vgyG1P8aiZuwRxJxodgIMU5beLM=
X-Received: by 2002:a05:620a:2009:: with SMTP id c9mr5232036qka.159.1610466730992; Tue, 12 Jan 2021 07:52:10 -0800 (PST)
MIME-Version: 1.0
References: <68acccb2-9e5f-f52d-b32c-3b6e3195bc2d@wickr.com> <7A25FA3A-25E9-47AD-9D8F-112F24748062@wire.com> <c069219e-2738-4cd3-b796-9f695b95c729@www.fastmail.com> <32c5c409abbc4cf79ecf74c9e2a2befa@inf.ethz.ch> <8C2A0364-90BA-4789-9163-E31D8C5B5CB1@wire.com> <1b6230e4-e634-18aa-cfe2-8c35f1b645c2@wickr.com> <4135BB5E-4C35-4EFF-91B6-1615C3560F1E@wire.com> <54f2ca71-89e1-1ca3-b783-2ad774564127@wickr.com> <CAL02cgRAh+wYz4PzxRY27XK6utHgvDAGJOSGQ=sx6N8MdJTYxQ@mail.gmail.com> <9bfee582-1227-754a-c70f-d59012e6ea82@wickr.com>
In-Reply-To: <9bfee582-1227-754a-c70f-d59012e6ea82@wickr.com>
From: Richard Barnes <rlb@ipv.sx>
Date: Tue, 12 Jan 2021 10:51:53 -0500
Message-ID: <CAL02cgQoVGB_DMWb3Mz_+cp5B6ucRZJdG3s3AWBXL1WPRxvx2g@mail.gmail.com>
To: Joel Alwen <jalwen@wickr.com>
Cc: Raphael Robert <raphael@wire.com>, "mls@ietf.org" <mls@ietf.org>, Mularczyk Marta <marta.mularczyk@inf.ethz.ch>
Content-Type: multipart/alternative; boundary="0000000000006c88a305b8b60139"
Archived-At: <https://mailarchive.ietf.org/arch/msg/mls/Nu_ZHUFBGqsA1Y_P_omZ-8DR2is>
Subject: Re: [MLS] New Parent Hash
X-BeenThere: mls@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Messaging Layer Security <mls.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/mls>, <mailto:mls-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/mls/>
List-Post: <mailto:mls@ietf.org>
List-Help: <mailto:mls-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/mls>, <mailto:mls-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 12 Jan 2021 15:52:19 -0000

Hey Joël,

Sorry for a bit of delay, took me a minute to get my thoughts in order.
But the answer seems pretty clear to me now.

tl;dr: On further thought, the “descendants” approach is O(N^2) for new
joiners.  So I’m inclined keep the resolution approach.

The parent hash for a parent node P incorporates some input from a copath
child C.  If I understand the security requirement correctly, you want the
copath child’s input to describe the subtree as it was the last time the
parent node was updated.  In at least enough detail to tell which private
keys received the private key for P, possibly in more detail.  This input
needs to remain constant as long as the public key in P remains the same,
so that validation is possible.

That raises the question of what details of the subtree under C can change
without the public key in P also being changed.   The only operations we
have that change the tree are Add, Update, Remove, and Commit.  Update and
Remove within the subtree result in the path from the leaf being blanked,
so the parent node will also be blanked.  A Commit within the subtree will
also replace P.  Only Add changes the subtree without changing the parent,
and the only change it makes is to (1) set the key package at the affected
leaf, and (2) add an unmerged leaf to the leaf node’s ancestors.

This implies that you can “rewind” the state of the tree to the state it
was in when the parent node was set by (1) resetting  the leaves in
P.unmerged_leaves to blanks and (2) removing P.unmerged_leaves from all
unmerged_leaves lists in the subtree.  This “rewound” tree is constant over
the lifetime over P.public_key, so any summary of it is suitable as an
input to the parent hash, for example:

* The resolution of the subtree root C [Alwen, Jost, Mularczyk]
* The tree hash of the subtree [Bhargavan, Beurdouche]
* A hash over the public keys [Robert, Barnes] — though this is not
sufficient to tell which nodes received P’s private key, and thus does not
meet the security goals

The necessity of this “rewind”, though, means that the implementation of
the copath child input basically needs to be “dynamic”.  To capture the
resolution, the copath child input needs to capture the “rewound” unmerged
leaf state, which means that the copath child input depends on the parent
as well as the copath child.  I had been thinking that there was a “static”
solution here, where you compute one thing per node, and update it when
something below the node changed, like with the tree hash.  Here, though, a
node’s contribution to a parent hash depends on the parent, you would need
to store several values per node, which removes any efficiency advantage.

Returning to the “resolution” vs. “descendants” decision… The “descendants”
option in this framing would entail computing either the tree hash or a
“tree hash lite” on the copath child’s subtree, where the “lite” variant
would omit some things from the leaves (e.g., only covering HPKE public
keys).  Because it has to be dynamic, that requires O(n) hash operations
for a full Update Path, with each step in the tree doing ~2^khashes, with k
increasing as you go up the tree.

In other words, we have the following trade-off:

* Resolution: Assures tree invariant only, costs ~log(N) ~ N hashes per
update path node  / parent hash
* Descendants: Assures full tree structure, costs n hashes per update path
node / parent hash

These costs are borne by the sender and the receiver of an UpdatePath.  A
new joiner verifying a tree will have to verify ~N parent hashes, so the
number of hash invocations will be O(N^2) in the worst case.  (Side note:
the verifier might actually have to do *double* that work, since it needs
to test both of its children.  We might want to add a “last updated child”
bit next to the “parent hash” field, so that the verifier can tell which
one was more recently updated, and thus should be correct.)

That latter cost factor of N makes the efficiency of the resolution
approach seem even more important.  While the extra rigidity from the
descendant approach seems appealing, it doesn’t seem essential.  Net of all
that, I’m inclined to keep the resolution approach.

—Richard

On Wed, Jan 6, 2021 at 9:31 AM Joel Alwen <jalwen@wickr.com> wrote:

> Responding inline...
>
> On 05/01/2021 15:48, Richard Barnes wrote:
> > Note: I am assuming that the "descendant hash" would be tree-structured
> like
> > the parent hash; it just would cover only the HPKE public keys instead
> of the
> > whole nodes.
>
> This would be a departure from the original Benjamin-Karthik approach as
> they
> included the Tree Hash of the co-path child when computing the Parent Hash
> of a
> node. Tree Hash covers more than the HPKE pks (e.g. the whole key package
> at a
> leaf and (in recent drafts) also the list of unmerged leaves at each
> node). So
> including only the HPKE leaves in a subtree introduces a 3rd hash for a
> node
> which would have to be computed and maintained by all members... I'm not
> sure
> that's worth it.
>
> I'm also not 100% clear yet on what the verification algorithm is when
> joining a
> group. Here's what I'm missing:
>
> For tree T and node V in T with child node W let PH(T,V,W) be the parent
> hash of
> V in T using co-path child W. Say, at time (epoch) t1 node V in ratchet
> tree T1
> was just assigned a new HPKE key pair with direct path child U and co-path
> child
> W. Later, you join the group (and V still has the same PK as in T1). You
> get the
> current tree T2 and want check PH(T1,V,W) is stored at U. Can anyone with a
> clearer picture than me (e.g. Richard?) sketch the algo to recompute
> PH(T1,V,W)
> given tree T2? Pseudocode, text whatever. (Or if that's not efficient on
> its own
> maybe sketch how to recompute PH(T,X,Y) you'd want to compute as part of
> verification. Amortization could help here since, done right, you may be
> able to
> reuse work from pasts PH computations in the subequent ones.)
>
> > From an implementation point of view, there's not a ton of difference
> between
> > these two
>
> I dunno... My guess is such tree verification would be more complicated
> (if not
> even asymptotically worse) than the resolution version from in Draft 11...
>
> E.g. one issue is that, because of unmerged leaves, sometimes we have to do
> compute hashes of multiple subsets of PKs from a nodes subtree. Here's an
> example. Extend the above example as follows:
>  - Unmerged leaf Z was added to W's subtree at time t3 : t1 < t3 < t2.
>  - Parent P of V gets an HPKE key with co-path child V at time t4 : t3 <
> t4 < t2.
>  - Let W be not blank in T2.
> To verify tree T2 we must compute both PH(T1,V,W) and PH(T4,P,V).
>
> For the resolution approach its pretty simple. For PH(T1,V,W) get W's
> resolution
> in T2 and remove Z (as Z is also be unmerged at V). For PH(T4,P,V) get P's
> resolution and remove any leaves that are unmerged at P.
>
> But for the descendants approach I think you need a fair bit more work. For
> PH(T1,V,W) you need the hash of W's descendent *excluding Z*. But for
> PH(T4,P,V)
> you need the hash of W's descendents *including Z*. I think that means
> having to
> re-visit every node from Z to W when computing PH(T4,P,V) despite having
> already
> visited all but P while computing PH(T1,V,W).
>
> In any case, I without some clearer idea of the verification algorithm I
> can't
> say I'd be in favor of switching to a descendant based Parent Hash.
>
> - Joël
>
>
> > - The "resolution" approach (in draft-11), which incorporates the
> minimum
> > information necessary to assure the tree invariant - The "descendants"
> > approach, which binds in the entire structure of the tree (in units of
> HPKE
> > keys)
> >
> > From an implementation point of view, there's not a ton of difference
> > between these two, but the "descendants" approach seems a bit nicer to
> me.
> > In the "resolution" case, you can re-use resolution logic; in the
> > "descendants" case, you can re-use the "update the changed nodes" logic
> that
> > the tree hash requires.  If anything, the "descendants" approach is a
> little
> > more "static", in that you can store the descendant hashes on the nodes,
> then
> > at parent-hash time, you can just grab the descendant-hash from the
> > appropriate child.  A tree-structured hash also seems easier to get right
> > than making sure the resolution is in the right order, and it's
> compatible
> > with the idea that you mainly need to know about the copath of an
> UpdatePath
> > in order to process it.
> >
> > From a security POV, it seems like either is acceptable, since they both
> > enforce the tree invariant.  The extra structure enforcement doesn't seem
> > like it adds much in terms of security, but it does seem a little easier
> to
> > think about, since you don't have to keep in mind the caveat that the
> > structure can change below the resolution points.
> >
> > So net/net, I have a slight preference for the "descendants" approach, if
> > people are OK with the extra rigidity.
> >
> > --Richard
> >
> > --Richard
> >
> > On Mon, Jan 4, 2021 at 8:33 AM Joel Alwen <jalwen@wickr.com
> > <mailto:jalwen@wickr.com>> wrote:
> >
> >
> > On 02/01/2021 18:02, Raphael Robert wrote:
> >> Thanks Marta and Joël for the explanation! That was the delta that was
> >> missing for my understanding and it’s definitely good to have that on
> >> list.
> >
> > Yes, good call. FYI there's a MUCH more complete discussion of the
> security
> > properties of MLS we've analyzed, the adversarial and communication
> models
> > we used and the actual security proof in eprint/2020/1327. (We'll be
> updating
> > it soon to include our security analysis of the new parent hash scheme.)
> >
> >> Just for context, what I had in mind was something that was not as
> >> detrimental to the membership deniability. I think in that respect both
> the
> >> “resolution” and the “descendants” approach are equivalent.
> >
> > I agree that the resolution method is hardly deniable in any formal
> sense I
> > know of. Switching from signing (hashes of) the entire list of certs of
> > everyone in a subtree to "only" signing HPKE pubkeys in the resolution
> does
> > feel like a pretty superficial improvement for deniability. :-/ To me,
> the
> > advantage of using resolutions is the simplicity & efficiency compared to
> > using descendants.
> >
> > TBH, for deniability I think there's a whole family of impossibilities
> > waiting to be formalized here (like the one in Thm. 1 of
> eprint/2019/1153).
> > Asynchronous + authenticated + (strongly?!) deniable + no global trusted
> > setup + communication complexity sub-linear in the group size? Seems very
> > hard to me (in my non-expert opinion). My guess is that real compromises
> in
> > the model / design goals need to be made to get (even just weak)
> > deniability.
> >> I’d however like to leave it as an open question whether we shouldn’t
> > reconsider locking down entire subtrees as a security-in-depth measure
> if we
> > can convince ourselves the performance toll is not too high.
> >
> > If we can do it, without paying too much of an efficiency price, then by
> all
> > means why not? I'm still curious what we'd be getting though. E.g. what
> needs
> > to happen so that hashing in descendants is secure but using only the
> > resolution isn't?
> >
> >> The tree hash has to be calculated for every epoch change anyway and
> it’s
> > always in O(n).
> >
> > For my own edification: I would have thought computing the new tree hash
> at
> > the root after a commit to p proposals requires at most ~ O(p*log(n))
> actual
> > hash calls + as many memory lookups no? Let L_p be the leaves modified by
> > the proposals. The root of any subtree with no leaf in L_p has the same
> tree
> > hash as before the commit. So by storing tree hashes as we compute them
> we
> > can replace computing unchanged subtree tree hashes with a lookup. What
> am I
> > missing here?
> >
> > - Joël
> >
> >
> >>> On 1. Jan 2021, at 23:10, Joel Alwen <jalwen@wickr.com
> > <mailto:jalwen@wickr.com>
> >>> <mailto:jalwen@wickr.com <mailto:jalwen@wickr.com>>> wrote:
> >>>
> >>> Servus Raphael,
> >>>
> >>>> Happy New Year
> >>>
> >>> Same to you! Here's hoping the new one will be a tad happier than the
> > last. ;-)
> >>>
> >>> As for why "resolutions": The impetus to look for a new approach after
> >>> our initial proposal came from a comment in your email on the Fri, 23
> >>> October
> > 2020
> >>> 18:23 UTC in the thread "Weak vs. Strong Tree Authentication". There
> you
> > wrote:
> >>>
> >>>> Going for fully hashing the tree nodes as described locks down the
> >>>> whole tree structure, making any reshuffling impossible. Maybe there
> is
> >>>> a more lightweight approach that would yield the same result without
> >>>> “provid[ing] irrefutable evidence that the signing key's owner was in
> a
> >>>> group with (at least some) of the other members in the group”, but I
> >>>> can’t think of anything right now.
> >>>
> >>> That was an intriguing challenge and got us thinking. The result is the
> > current
> >>> Parent Hash definition using resolutions. We realized that to buy
> wiggle
> > room in
> >>> not locking down as much of the tree (especially, not locking down
> other
> > members
> >>> identities into the hash) we could pay a "price" that actually doesn't
> > harm the
> >>> security we've been aiming for. Namely, we can allow the adversary some
> >>> more tree manipulation but still only benign ones that dont actually
> >>> allow her to break the security property we've been aiming for.
> >>
> >> Just for context, what I had in mind was something that was not as
> >> detrimental to the membership deniability. I think in that respect both
> >> the
> > “resolution” and
> >> the “descendants” approach are equivalent. That being said, this looks
> >> quite elegant!
> >>
> >>> In retrospect, a second reason to use resolutions is efficiency (though
> > to fair,
> >>> that was really a coincidence rather than an explicit goal of the
> >>> redesign). Initially, we had been thinking of simply having parent hash
> >>> include tree
> > hash
> >>> as in the Benjamin-Karthik (BK) proposal of days yore. But we realized
> >>> that since then MLS had introduced Unmerged Leaves which meant that the
> >>> tree
> > hash of
> >>> some nodes can change between when they are assigned their HPKE key and
> >>> later when a new member wants to verify the parent hash of the node. In
> >>> other words the BK approach was broken by unmerged leaves. So, to make
> >>> the "include the whole subtree in parent hash" approach work, it seems
> >>> new members would, for each node being verified, traverse the nodes
> full
> >>> sub-tree so as to weed out those nodes that were added after the
> >>> to-be-verified node got its HPKE
> > key. (I
> >>> believe this is the O(n) time that Hubert is referring too?) In
> contrast,
> >>> the new version of parent hash (using resolutions) has the potential to
> >>> be much faster; it requires hashing less data but also visiting less of
> >>> the tree.
> > E.g.
> >>> when verifying near the root the difference can be as much as visiting
> > order n
> >>> nodes vs. visiting 1 node in the new construction.
> >>>
> >>> The more blanks & unmerged leaves in the tree the more the new
> >>> construction collapses to the old one. But only in the most extreme
> case
> >>> do they actually reach parity and the new approach is never slower than
> >>> the old.
> >>>
> >>> Needless to say, we didn't implement this stuff though, let alone
> > benchmark the
> >>> 2 approaches so any hard data elucidating what the actual difference is
> > would be
> >>> very welcome. Still, the new approach can only be as fast or faster
> (but
> > never
> >>> slower) than the new one. Nor do I think it's any more complicated to
> > define or
> >>> implement. And from a security perspective they both let MLS meet our
> > security
> >>> definition. So regardless of how small the speed up is (and ignoring
> the
> > locking
> >>> in the tree discussion above) it's still not clear to me why we'd want
> >>> to
> > switch
> >>> back.
> >>>
> >>> A meaningful attack that only works on the new but not old construction
> > would be
> >>> one very quick way of changing my mind though! :-) But, I believe such
> >>> an
> > attack
> >>> would require either new adversarial capabilities and/or break a new
> >>> security property we didn't consider. (For the adversaries & security
> we
> >>> considered, there is no security loss with the new approach compared to
> >>> the old one.)
> >>
> >> I think we are now fully aligned on the definitions and the performance
> >> implications and now that we have better documentation we can continue
> > with the
> >> interim effort.
> >>
> >> I’d however like to leave it as an open question whether we shouldn’t
> > reconsider
> >> locking down entire subtrees as a security-in-depth measure if we can
> >> convince ourselves the performance toll is not too high. For example,
> I’m
> >> wondering if there isn’t a clever way to combine the tree
> > hash
> >> and the parent hash. The tree hash has to be calculated for every epoch
> >> change anyway and it’s always in O(n).
> >>
> >> Thanks
> >>
> >> Raphael
> >>
> >>>
> >>> - Joël
> >>>
> >>> On 01/01/2021 17:30, Raphael Robert wrote:
> >>>> Thanks for the lengthy explanation!
> >>>>
> >>>> I’m glad we are on the same page regarding the definition of the
> > resolution.
> >>>> I think I understand where the misunderstanding is coming from.
> >>>>
> >>>> Initially you proposed a “strong hash” solution on the ML (on Oct 23
> >>>> 2020) that was the following:
> >>>>
> >>>> "strong parent_hash" : Define parent_hash to include tree_hash (and
> >>>> tree_hash to *not* include parent_hash). The tree_hash of a node
> covers
> >>>> all values in the subtree rooted at that node  (except the
> >>>> parent_hashes and signatures). So basically something like this: -
> >>>> leaf.tree_hash := H(leaf.data) - node.tree_hash := H(node.data,
> >>>> node.lchild.tree_hash, node.rchild.tree_hash) - root.parent_hash := 0
> -
> >>>> node.parent_hash := H(node.parent.tree_hash, node.parent.parent_hash).
> >>>>
> >>>> This definition did include the whole subtree and therefore I was
> >>>> assuming that this would also end up in the spec change. As far as I
> >>>> now understand, you went for a more lightweight approach where you say
> >>>> that just using the resolution is enough, because pinning all nodes in
> >>>> the subtree is not necessary and would only prevent “benign” attacks.
> I
> >>>> guess I missed the
> > part
> >>>> where you switched from one concept to the other.
> >>>>
> >>>> What I discovered during implementation is that when you only use the
> >>>> resolution approach and the tree has no blanks, you are not always
> able
> >>>> to detect if the attacker has swapped two leaves. This might indeed
> >>>> just be a benign attack in the sense that the tree invariant is not
> >>>> violated.
> >>>>
> >>>> At this point I don’t have strong arguments for or against using the
> >>>> resolution approach, I would have to think about it some more. Just as
> >>>> a reminder for context, the original concept by Benjamin and Karthik
> >>>> was “parent hash covers the tree hash”. That’s pretty much what you
> >>>> initially proposed, except that it also covered the KeyPackage at the
> >>>> leaf level, not just the HPKE public key.
> >>>>
> >>>> My question right now would be: What is the benefit of the lightweight
> >>>>  resolution-based approach? If it’s just efficiency, we should look at
> > whether
> >>>> there’s really that much to gain.
> >>>>
> >>>> On efficiency: The parent hashes are verified in two instances: a)
> when
> >>>> a new member joins the group and validates the public tree it received
> >>>> and b) when a member receives a full Commit from another member.
> >>>>
> >>>> Re a): The spec now says "Moreover, when joining a group, new members
> >>>> MUST authenticate each non-blank parent node P. A parent node P is
> > authenticated”.
> >>>> This means the effort is linear to start with. Depending on how you
> >>>> define the operation, it is even worse than linear. Say the tree has n
> >>>> leaves and therefore 2n-1 nodes. For every node, we need to iterate
> >>>> over at least a
> > part
> >>>> of the subtree and collect public keys that we then concatenate. We
> end
> >>>> up only hashing once per node, but with a variable size payload. I
> >>>> think the cost of navigating the subtree and collecting the keys is
> >>>> O(log n) for the “resolution approach and O(n) for the “descendants"
> >>>> approach and the cost of hashing is linearly proportional to the size
> >>>> of the payload. If we assume that hashing is the most expensive
> >>>> operation, the total cost would be O(n * log n) for “resolution” and
> >>>> O(n ^ 2) for “descendants".
> >>>>
> >>>> Re b): It is very similar to a), except that we only have to iterate
> >>>> over log n nodes of the direct path, not the whole tree. That means
> >>>> that the total effort is O(log n * log n) for “resolution” and O(n *
> >>>> log n) for “descendants”.
> >>>>
> >>>> There’s clearly a difference in terms of efficiency between the two
> >>>> approaches. That being said, we should keep in mind that for small
> > payloads,
> >>>> hashing is a lot faster than other cryptographic operations like e.g.
> >>>> producing and verifying signatures. Since a new joiner needs to also
> >>>> verify n signatures when joining a group, it could well be that the
> >>>> effort of verifying the parent hashes is negligible compared to the
> >>>> cost of verifying all signatures.
> >>>>
> >>>> In summary: I still don’t have strong arguments against the
> >>>> “resolution” approach, but I also have a feeling that the efficiency
> >>>> differences are not dramatic either.
> >>>>
> >>>> Happy New Year and thanks!
> >>>>
> >>>> Raphael
> >>>>
> >>>>> On 1. Jan 2021, at 15:48, Mularczyk Marta
> >>>>> <marta.mularczyk@inf.ethz.ch
> > <mailto:marta.mularczyk@inf.ethz.ch>
> >>>>> <mailto:marta.mularczyk@inf.ethz.ch
> >>>>> <mailto:marta.mularczyk@inf.ethz.ch>>
> >>>>> <mailto:marta.mularczyk@inf.ethz.ch
> > <mailto:marta.mularczyk@inf.ethz.ch> <mailto:marta.mularczyk@inf.ethz.ch
> > <mailto:marta.mularczyk@inf.ethz.ch>>>>
> >>>>> wrote:
> >>>>>
> >>>>> Hi Raphael,
> >>>>>
> >>>>> Thanks for explaining! Using "descendants" is definitely better for
> >>>>> security, but also, as Hubert mentioned, less efficient.
> >>>>>
> >>>>> Also we did understand “resolution” the same way as you when we
> >>>>> proposed the strong parent hash (sorry for the confusion).
> >>>>>
> >>>>> But we obviously really want to understand if/why using resolution is
> >>>>>  insecure. Ultimately, the security property we care about is that
> >>>>> privacy and authenticity of application messages holds. The Parent
> >>>>> Hash mechanism helps us ensure this holds also for epochs in
> >>>>> adversarially generated groups, as long as the adversary does not
> >>>>> control any leaves in this epoch (i.e. does not know the signing key
> >>>>> of the leaf). So to really see if your examples break security, it
> >>>>> would be good to see if they let us violate this security goal.
> >>>>>
> >>>>> To see why we believe the new parent hash notion (i.e. with
> >>>>> resolutions) achieves this, let us first clarify what technical
> >>>>> property about (even adversarially produced) ratchet trees we want
> >>>>> from verifying parent hashes. It’s not that the adversary can’t
> >>>>> produce manipulated trees at all. It’s that the only manipulations
> >>>>> she is able to do will not let her produce a tree that violates the
> >>>>> tree invariant. That’s because (in our model) when the tree invariant
> >>>>> holds then we can show that the epoch is secure.
> >>>>>
> >>>>> So some manipulations *are* possible. They are just “benign”. For
> >>>>> example, suppose adversary A controls a leaf L with HPKE key pk in
> >>>>> tree T (i.e. she knows the signing key of L’s owner). Then A can
> >>>>> construct a new tree T’ identical to T but where leaf L has pk’ != pk
> >>>>> (say, by performing a commit at leaf L to obtain T’). But for this,
> >>>>> in T’ leaf L must still be controlled by A. We call this manipulation
> >>>>> “benign” because even though A might now break privacy of application
> >>>>> messages in epoch with tree T’ it doesnt violate our security goal
> >>>>> because privacy is conditioned on A *not* controlling any leaves.
> >>>>>
> >>>>> Conversely, one thing A can *not* do is the following. Say T’ has a
> >>>>> node V with HPKE public key pk0 such that A knows corresponding sk0.
> >>>>> Then for T’ to pass verification, it must contain a leaf controlled
> >>>>> by A. This holds since for T’ to pass verification there must be a
> >>>>> leaf in V’s subtree with a signature covering a hash that includes
> >>>>> pk0. I.e. some leaf must “attest” to having inserted pk0 into T’.
> >>>>> (Recall that in our model the adversary can’t just leak sk0 alone.
> >>>>> They must leak a full state of some client which also means leaking
> >>>>> the client's signing key.)
> >>>>>
> >>>>> So our question to you is the following. (Does this make sense? And)
> >>>>> we’re not exactly sure what you have in mind when you say “swapping
> >>>>> out two leaf nodes”. But we think it would only be a problem if it is
> >>>>> not benign. Is this true for your attack? Could you expand a bit on
> >>>>> what tree manipulation you have in mind?
> >>>>>
> >>>>> Best, Joël& Marta
> >>>>>
> >>>>>
> >
> --------------------------------------------------------------------------------
> >
> >
> >>>>
> >>>>>
> >>>>>
> >>> *From:* MLS <mls-bounces@ietf.org <mailto:mls-bounces@ietf.org>
> >>> <mailto:mls-bounces@ietf.org
> > <mailto:mls-bounces@ietf.org>> <mailto:mls-bounces@ietf.org
> > <mailto:mls-bounces@ietf.org>
> >>> <mailto:mls-bounces@ietf.org <mailto:mls-bounces@ietf.org>>>> on
> behalf
> >>> of
> >>>>> Hubert Chathi <hubertc@matrix.org <mailto:hubertc@matrix.org>
> >>>>> <mailto:hubertc@matrix.org
> > <mailto:hubertc@matrix.org>> <mailto:hubertc@matrix.org
> > <mailto:hubertc@matrix.org>
> >>>>> <mailto:hubertc@matrix.org <mailto:hubertc@matrix.org>>>> *Sent:*
> >>>>> Thursday, December 31, 2020 5:36:53 PM *To:* mls@ietf.org
> > <mailto:mls@ietf.org> <mailto:mls@ietf.org <mailto:mls@ietf.org>>
> >>>>> <mailto:mls@ietf.org <mailto:mls@ietf.org> <mailto:mls@ietf.org
> > <mailto:mls@ietf.org>>> *Subject:* Re: [MLS] New Parent Hash
> >>>>>
> >>>>> I think that if you use "descendants" as you define it, then
> >>>>> calculating the parent_hash will become O(n) all the time, rather
> >>>>> than O(log n) in the best-case, since the size of the descendants
> >>>>> list will be O(n).  I'm not sure if there's a good way to maintain
> >>>>> O(log n) and still provide the desired security.
> >>>>>
> >>>>> On Thu, 31 Dec 2020, at 06:47, Raphael Robert wrote:
> >>>>>> Hi Joël, Daniel and Marta,
> >>>>>>
> >>>>>> I just implemented the stronger parent hash scheme you proposed in
> >>>>>> the way it is described in the spec now. The good news is that it
> >>>>>> is implementable, the bad news is that is does not solve the
> >>>>>> problem you initially describe. Specifically, swapping out two leaf
> >>>>>> nodes still goes undetected.
> >>>>>>
> >>>>>> I think it is due to a terminology conflict, but it would be great
> >>>>>> if you could validate that assumption. In 7.4 you introduce the
> >>>>>> notion of “original child resolution” as " the array of
> >>>>>> HPKEPublicKey values of the nodes in the resolution of S”. The term
> >>>>>> “resolution” is defined in 5.2. It is essentially a way of dealing
> >>>>>> with blank nodes by substituting them with the first non-blank
> >>>>>> descendants you find when descending the subtree of a node. Here is
> >>>>>> my assumption: You probably thought the resolution would cover all
> >>>>>> non-blank descendants of a node, not just the first non-blank
> >>>>>> descendant. In that case we would need another function that does
> >>>>>> that (let’s call it “descendants”).
> >>>>>>
> >>>>>> If we look at the example of 5.2:
> >>>>>>
> >>>>>> - The resolution of node 3 is the list [A, CD, C] - The
> >>>>>> descendants of node 3 is the list [A, CD, C, D]
> >>>>>>
> >>>>>> - The resolution of node 5 is the list [CD] - The descendants of
> >>>>>> node 5 is the list [CD, C, D] (we just need to additionally filter
> >>>>>> out C, because it is an unmerged leaf of node 5)
> >>>>>>
> >>>>>> Here is a code snippet of what the “descendants” function would
> >>>>>> look like (sorry that it’s Rust, not Python):
> >>>>>>
> >>>>>> fn descendants(x: NodeIndex, size: LeafIndex) -> Vec<NodeIndex> {
> >>>>>> if level(x) == 0 { vec![x] } else { let left_child = left(x); let
> >>>>>> right_child = right(x, size); [ descendants(left_child, size),
> >>>>>> vec![x], descendants(right_child, size), ] .concat() } }
> >>>>>>
> >>>>>> There’s a more efficient implementation of course, this is just to
> >>>>>>  illustrate.
> >>>>>>
> >>>>>> In other words, the new “original child resolution” of S would be
> >>>>>> the subtree under S, but with filtered-out blank nodes and unmerged
> >>>>>> leaves of S.
> >>>>>>
> >>>>>> If my assumption is correct, I’ll follow up quickly with a PR since
> >>>>>> this is currently a blocker for interop based on draft-11.
> >>>>>>
> >>>>>> Thanks
> >>>>>>
> >>>>>> Raphael
> >>>>>>
> >>>>>>
> >>>>>>
> >>>>>>> On 12. Nov 2020, at 22:14, Joel Alwen <jalwen@wickr.com
> > <mailto:jalwen@wickr.com>
> >>>>>>> <mailto:jalwen@wickr.com <mailto:jalwen@wickr.com>>
> >>>>>>> <mailto:jalwen@wickr.com <mailto:jalwen@wickr.com>
> > <mailto:jalwen@wickr.com <mailto:jalwen@wickr.com>>>> wrote:
> >>>>>>>
> >>>>>>> Hey people,
> >>>>>>>
> >>>>>>> Just a quick heads up that Daniel, Marta and I put in a PR with a
> >>>>>>> new version of strong parent hash to prevent the attacks from the
> >>>>>>> earlier thread. We tried to make it as minimal as we could to
> >>>>>>> help with deniability while still preventing the attacks
> >>>>>>> permitted by weak parent hashes.
> >>>>>>>
> >>>>>>> In a nutshell, when computing parent_hash at node v with parent p
> >>>>>>> and sibling w we include - p's HPKE pubkey - p's parent_hash
> >>>>>>> value and - HPKE pub keys in the resolution of w except for those
> >>>>>>> belonging to leaves unmerged at p.
> >>>>>>>
> >>>>>>> As a sanity check, notice that as long as p's keys remain the
> >>>>>>> same one can always recompute the same parent_hash value at v as
> >>>>>>> was initially computed by the member that set p's keys. (In other
> >>>>>>> words, new members can verify that the stored parent_hash values
> >>>>>>> match whats in the tree.) In particular, that's coz as long as
> >>>>>>> p's keys are unchanged so is the resolution of w. The only
> >>>>>>> exception are leaves being added as unmerged at w. But those
> >>>>>>> leaves are also added as unmerged at p so they are left out of
> >>>>>>> the hash.
> >>>>>>>
> >>>>>>> As for deniability, at least parent_hash only binds HPKE keys and
> >>>>>>>  nothing else (like, say, credentials or signatures). Its the
> >>>>>>> best we were able to come up with for now...
> >>>>>>>
> >>>>>>> - joël
> >>>>>>>
> >>>>>>> _______________________________________________ MLS mailing list
> >>>>>>>  MLS@ietf.org <mailto:MLS@ietf.org> <mailto:MLS@ietf.org
> > <mailto:MLS@ietf.org>> <mailto:MLS@ietf.org <mailto:MLS@ietf.org>
> >>>>>>> <mailto:MLS@ietf.org <mailto:MLS@ietf.org>>>
> >>>>>>> https://www.ietf.org/mailman/listinfo/mls
> > <https://www.ietf.org/mailman/listinfo/mls>
> >>>>>>> <https://www.ietf.org/mailman/listinfo/mls
> > <https://www.ietf.org/mailman/listinfo/mls>>
> >>>>> <https://www.ietf.org/mailman/listinfo/mls
> > <https://www.ietf.org/mailman/listinfo/mls>
> >>>>> <https://www.ietf.org/mailman/listinfo/mls
> > <https://www.ietf.org/mailman/listinfo/mls>>>
> >>>>>>
> >>>>>> _______________________________________________ MLS mailing list
> >>>>>> MLS@ietf.org <mailto:MLS@ietf.org> <mailto:MLS@ietf.org
> > <mailto:MLS@ietf.org>> <mailto:MLS@ietf.org <mailto:MLS@ietf.org>
> > <mailto:MLS@ietf.org <mailto:MLS@ietf.org>>>
> >>>>>> https://www.ietf.org/mailman/listinfo/mls
> > <https://www.ietf.org/mailman/listinfo/mls>
> >>>>>> <https://www.ietf.org/mailman/listinfo/mls
> > <https://www.ietf.org/mailman/listinfo/mls>>
> >>>>> <https://www.ietf.org/mailman/listinfo/mls
> > <https://www.ietf.org/mailman/listinfo/mls>
> >>>>> <https://www.ietf.org/mailman/listinfo/mls
> > <https://www.ietf.org/mailman/listinfo/mls>>>
> >>>>>>
> >>>>>
> >>>>> _______________________________________________ MLS mailing list
> >>>>> MLS@ietf.org <mailto:MLS@ietf.org> <mailto:MLS@ietf.org
> > <mailto:MLS@ietf.org>> <mailto:MLS@ietf.org <mailto:MLS@ietf.org>
> > <mailto:MLS@ietf.org <mailto:MLS@ietf.org>>>
> >>>>> https://www.ietf.org/mailman/listinfo/mls
> > <https://www.ietf.org/mailman/listinfo/mls>
> >>>>> <https://www.ietf.org/mailman/listinfo/mls
> > <https://www.ietf.org/mailman/listinfo/mls>>
> >>>>> <https://www.ietf.org/mailman/listinfo/mls
> > <https://www.ietf.org/mailman/listinfo/mls>
> >>>>> <https://www.ietf.org/mailman/listinfo/mls
> > <https://www.ietf.org/mailman/listinfo/mls>>>
> >>
> >
> > _______________________________________________ MLS mailing list
> MLS@ietf.org
> > <mailto:MLS@ietf.org> https://www.ietf.org/mailman/listinfo/mls
> > <https://www.ietf.org/mailman/listinfo/mls>
> >
>