Re: [MLS] Simplifying the key schedule

Richard Barnes <rlb@ipv.sx> Mon, 25 February 2019 15:32 UTC

From: Richard Barnes <rlb@ipv.sx>
Date: Mon, 25 Feb 2019 10:32:00 -0500
To: Joel Alwen <jalwen@wickr.com>
Cc: Messaging Layer Security WG <mls@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/mls/wQG2Vib23uPsBjHw1st54riRwRI>

Hi Joel,

Some notes inline:

On Sun, Feb 24, 2019 at 12:17 AM Joel Alwen <jalwen@wickr.com> wrote:

> Hi Richard,
>
> Sandro Corretti and I just discussed the questions you brought up in
> your email about the key schedule, and we've come round to the opinion
> that, of the options you present, we prefer:
>
> - for the roster, a Merkle tree seems to be a good solution
> - for the ratchet tree, an ampelmann hashing scheme should work well
>
> In our opinion, linear state size (with reasonably small constants) is
> the least of the three evils when compared to needing extensive hashing
> for each tree update and needing larger packet sizes for the protocol
> (e.g. due to membership proofs).
>

Thanks for the clear priorities!


> Having said that, I wanted to suggest an alternative: namely, including
> the IDs of leaves as part of what gets hashed for the ratchet tree.
> This makes including the roster redundant, which means we would save on
> all the Merkle tree storage as well as the hashing during join/leave
> ops. The only cost I see is that we might want to include the *hash* of
> the IDs as part of the state, to avoid having to recompute it when a
> party updates their PK. (But that still needs less storage than a full
> Merkle tree on the roster, so it's also an improvement.)
>

This seems sensible to me.  So you would basically end up with a unified
tree with the following shape:

- Leaves hold (DHPublicKey, Credential, NodeHash)
- Intermediate nodes hold (DHPublicKey, LeftHash, RightHash, NodeHash)

It actually looks to me like this is a wash from a storage POV -- in each
case, you have 2n-1 DH keys, n credentials, and n-1 hashes.  But it
probably saves you some hashing in some cases.
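
For concreteness, here's a rough Python sketch of how the per-node hashes
might be computed in that unified tree. The field layout, the use of
SHA-256, and the exact concatenation order are my own assumptions for
illustration, not anything the draft pins down:

    import hashlib

    def H(data: bytes) -> bytes:
        return hashlib.sha256(data).digest()

    def leaf_node_hash(dh_public_key: bytes, credential: bytes) -> bytes:
        # The leaf hash covers both the key material and the member's
        # identity, so no separate roster Merkle tree is needed.
        return H(dh_public_key + H(credential))

    def parent_node_hash(dh_public_key: bytes, left_hash: bytes,
                         right_hash: bytes) -> bytes:
        # Intermediate nodes bind their own key to both children's hashes.
        return H(left_hash + right_hash + dh_public_key)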



> While we are on the subject of ampelmann trees, I wanted to make a
> suggestion for how to deal with blank nodes. I think it's conceptually
> simpler (and thus hopefully also less error prone / easier to code) than
> using the resolution of blank nodes to define their values in the
> ampelmann hash. We let the value of each node begin with an indicator
> bit b (or octet if that's more practical). So node values are strings of
> the form b || PK. (Alternatively, for leaves it would be b || PK ||
> H(ID).) When b indicates the node is blank (b=1), we simply define PK
> (and H(ID)) to be all 0s.
>

This is actually already mostly covered in the current spec.  The nodes in
the ratchet tree are of type optional<DHPublicKey>, which encodes as
0x01 || PK when populated and 0x00 when blank.  So if we just use that same
struct for the V input to the ampelmann hashing, things should be pretty
straightforward to code.
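
As a minimal sketch of that encoding (the helper name and the Python typing
are mine; the presence-octet values follow the optional<T> convention
described above):

    from typing import Optional

    def encode_optional_public_key(dh_public_key: Optional[bytes]) -> bytes:
        # optional<DHPublicKey>: a one-octet presence flag, then the value.
        # 0x00 marks a blank node; 0x01 || PK marks a populated one.
        if dh_public_key is None:
            return b"\x00"
        return b"\x01" + dh_public_key

Feeding this encoded value in as V means blank nodes need no special
handling in the tree-hash logic itself.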


> I'm also in favour of Raphael's suggestion to define the hash of a node
> to be H(L || R || V) where L and R are the hash of the left and right
> child while V denotes the value of the current node. It seems simpler
> than H(H(L || R) || V).
>

The only difference that occurs to me here is that the nested form offers a
factor of two in proof size: if you don't care about the children, you can
just send H(L || R) instead of sending (L, R).  Of course, there's a
corresponding factor of two in hashing, so there's another trade-off space
to select from.
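
To make the two variants concrete, here's a small sketch (the function
names and the SHA-256 choice are mine, just for illustration):

    import hashlib

    def H(data: bytes) -> bytes:
        return hashlib.sha256(data).digest()

    def node_hash_flat(left: bytes, right: bytes, value: bytes) -> bytes:
        # Raphael's suggestion: one hash over both children and the value.
        return H(left + right + value)

    def node_hash_nested(left: bytes, right: bytes, value: bytes) -> bytes:
        # Nested form: a proof that doesn't need the children can carry
        # H(left || right) alone, at the cost of an extra hash per node.
        return H(H(left + right) + value)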

--Richard



>
> - Joel
>
>
> On 2/23/19 10:27 AM, Richard Barnes wrote:
> > At the last interim, we filed an issue to discuss simplifying the key
> > schedule.  In particular, folks were unhappy that the whole, possibly
> > gigantic GroupState object has to be hashed on every epoch change.  When
> > I sat down to think about how to simplify, though, it wasn't clear what
> > direction to go in, because it wasn't clear exactly what people were
> > finding painful.
> >
> > I'll posit from the start that I'm pretty sure that we need to somehow
> > include the tree, the roster, and the message transcript into the key
> > schedule.  The transcript inclusion follows from the same reasons as it
> > does in TLS; the roster and tree are needed because the whole transcript
> > isn't accessible to a new joiner (so the new joiner can't infer those
> > values from the transcript).
> >
> > With that said, there are multiple ways to represent those values for
> > inclusion in the key schedule:
> >
> > 1. Directly (as now)
> > 2. As hashes
> > 3. As structured hashes, e.g., a Merkle or "ampelmann" tree
> >
> > People are clearly unhappy with (1).  The difference between 2 and 3
> > (and between variants of 3) is what value we're trying to achieve.  (2)
> > reduces the amount of hashing somewhat, but it doesn't change the amount
> > of state you have to keep -- each member has to keep the full roster and
> > tree, and hash the whole thing when it changes.
> >
> > For (3), it seems like there are weak and strong forms.  In the weak
> > form, we make it so that the member has to hash a sub-linear-size amount
> > of data to update its representation of the tree/roster when it changes
> > (e.g., in a Merkle tree, you have to recompute log(N) nodes).  But each
> > member still has to have the whole structure.  In the strong form, we
> > arrange the protocol so that a member can operate while holding a
> > sub-linear amount of state (e.g., just a copath/frontier in some tree).
> > Note that the strong form would entail pretty major protocol changes,
> > e.g., bringing back the Merkle membership proofs that were in earlier
> > versions of the protocol.  For similar reasons, this would also entail
> > larger messages.
> >
> > So basically, the mapping of options to values is:
> >
> > 1. Overall simplicity
> > 2. A smaller coefficient on the linear amount of data hashed
> > 3(weak). Sub-linear amount of data hashed
> > 3(strong). Sub-linear amount of state
> >
> > In my personal value judgement, I would probably end up somewhere
> > between (2) and (3,weak).  It seems pretty unavoidable to me that
> > members will have *some* linear-size state for the group, even if it's
> > only names for the members; adding, say, a key and a roster entry for
> > each of those doesn't seem like a huge increment.  So appealing though
> > the state savings would be, they're not compelling enough to merit the
> > protocol complexity and overhead.
> >
> > All that said, I would be interested in other folks' thoughts on which
> > options are to be preferred here, or if there are other options that
> > come to mind.
> >
> > Thanks,
> > --Richard