Re: [MLS] Simplifying the key schedule

Joel Alwen <jalwen@wickr.com> Sun, 24 February 2019 05:17 UTC

Return-Path: <jalwen@wickr.com>
X-Original-To: mls@ietfa.amsl.com
Delivered-To: mls@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 34DE1130E68 for <mls@ietfa.amsl.com>; Sat, 23 Feb 2019 21:17:24 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.902
X-Spam-Level:
X-Spam-Status: No, score=-1.902 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_PASS=-0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=wickr-com.20150623.gappssmtp.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id qABIa1WTVBHV for <mls@ietfa.amsl.com>; Sat, 23 Feb 2019 21:17:21 -0800 (PST)
Received: from mail-pg1-x529.google.com (mail-pg1-x529.google.com [IPv6:2607:f8b0:4864:20::529]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 02228130E0A for <mls@ietf.org>; Sat, 23 Feb 2019 21:17:20 -0800 (PST)
Received: by mail-pg1-x529.google.com with SMTP id 196so2944037pgf.13 for <mls@ietf.org>; Sat, 23 Feb 2019 21:17:20 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=wickr-com.20150623.gappssmtp.com; s=20150623; h=subject:to:references:from:message-id:date:user-agent:mime-version :in-reply-to:content-language:content-transfer-encoding; bh=pCHz7A3jICDM7irncqhN93xNdlJ+mcU61AkKqV1+XgY=; b=KK31lK8rsfqNd2PV+QpnO/z69kKz/0RKnRB3Qnq6WvbUoToHcGr0K7lZAPTb/fOgKZ EitLSope4e+0152za4QrwUS55MbtViKV7Wi/WigXeAIyg2Ui8yBDpLaWeeLHnHBHoGYZ aUfSpXn51hF4iRGEYA/9eax8OR/BMieh1Jg6Q1P3sqoVNV+TSIbkJZsnkxJbl1EE5X0k ZddkX3YFVbA8wTX3gVjkHrAQoIogfDbB6ZtcdHiVQOTz2oca40VWDb1iU2HJBEL++5zv qjhR1k0XCq82D6rW13qwLMZ324EfR2OWHSeXPOUbfuq/PltqvYqAFPwoTEfloRdIMb5D GxsA==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=pCHz7A3jICDM7irncqhN93xNdlJ+mcU61AkKqV1+XgY=; b=G4QKTGS/XENPQxV8XPO2GZ/A8ax4Da8zMtKjRNqqZ7+UNkTZA5fuA2o/e9q7QsvKYX AbKMC0ytFMfgk/wdbnIFn2lquuCF0+wumgPNLf+CtPomaoZ4ppGPlXOxm9L8wLOcDC1O x/L7xnnPDpodS8Jvfbs63q4PKsyYoOyQ8L2a5mN+ab6x35lPUqCKiYSyoSQcf+kc72/k JXD9E3Zfo2Z4cq/JvddDLayai9t3HOb1QYesVaUSBQ2mfaWPetAUXt6FZjlk6GSvfMYE DiCNrql21u+x+lbIILi1hDV3XOt89Pr6gvDCj1RilPMTgTXMQehLP0vKW8b2WsWvB5zT JNVw==
X-Gm-Message-State: AHQUAubdYUbQjIECUwrJ6ppnBJoaUfK3f8gp+Yem2ncL5HOkhCVBghki upr/b2DNmI04LvJoMJwIUyj9511vmIkmWA==
X-Google-Smtp-Source: AHgI3IYwVA2Xd1RLesFUrVoPxDOFaLyMvgmLIujIzz4HDqz01+yPhRGyyL/cB0peeMau9SDmSvf3Vw==
X-Received: by 2002:a63:ed42:: with SMTP id m2mr11696442pgk.147.1550985439705; Sat, 23 Feb 2019 21:17:19 -0800 (PST)
Received: from [192.168.2.182] (mx-ll-180.183.72-41.dynamic.3bb.co.th. [180.183.72.41]) by smtp.gmail.com with ESMTPSA id m64sm14708428pfi.149.2019.02.23.21.17.18 for <mls@ietf.org> (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Sat, 23 Feb 2019 21:17:18 -0800 (PST)
To: mls@ietf.org
References: <CAL02cgT_HiggpB6KDnXyqzyz5WUr=dAodmaRSPt+PfWj3BM3Mg@mail.gmail.com>
From: Joel Alwen <jalwen@wickr.com>
Message-ID: <88df2d10-8ffa-11ce-64a9-be2da501892a@wickr.com>
Date: Sun, 24 Feb 2019 12:17:16 +0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.4.0
MIME-Version: 1.0
In-Reply-To: <CAL02cgT_HiggpB6KDnXyqzyz5WUr=dAodmaRSPt+PfWj3BM3Mg@mail.gmail.com>
Content-Type: text/plain; charset="utf-8"
Content-Language: en-GB
Content-Transfer-Encoding: 8bit
Archived-At: <https://mailarchive.ietf.org/arch/msg/mls/X0QihaNEbnZiHE1gD3kVJY2BgW0>
Subject: Re: [MLS] Simplifying the key schedule
X-BeenThere: mls@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Messaging Layer Security <mls.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/mls>, <mailto:mls-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/mls/>
List-Post: <mailto:mls@ietf.org>
List-Help: <mailto:mls-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/mls>, <mailto:mls-request@ietf.org?subject=subscribe>
X-List-Received-Date: Sun, 24 Feb 2019 05:17:24 -0000

Hi Richard,

Sandro Corretti and I just discussed the questions you brought up in
your email about the key schedule and we've come round to the opinion
that of the options you present we prefer:

- for the roster a merkle tree seems to be a good solution
- for the ratchet tree an ampelmann hashing scheme should work well

In our opinion linear state size (with reasonably small constants) is
the lesser of the three evils when compared to needing extensive hashing
for each tree update and needing larger packet sizes for the protocol
(e.g. due to membership proofs).

Having said that I wanted to suggest an alternative. Namely, including
the ID's of leaves as part of what gets hashed for the ratchet tree.
This makes including the roster redundant which means we would save on
all the merkle tree storage as well as the hashing during join/leave
ops. The only cost I see is that we might want to include the *hash* of
ID's as part of the state to avoid having to recompute when that party
updates their PK. (But that still needs less storage than a full merkle
tree on the roster so its also an improvement.)




While we are on the subject of amplemann tree's I wanted to make a
suggestion for how to deal with blank nodes. I think its conceptually
simpler (and thus hopefully also less error prone / easier to code) then
using the resolution of blank nodes to define their values in the
ampelmann hash. We let the value of each node begin with an indicator
bit b (or octet if thats more practical). So node values are strings of
the form b || PK. (Alternatively for leaves it would be b || PK ||
H(ID).) When b indicates the node is blank (b=1) we simply define PK
(and H(ID)) to be all 0s.

I'm also in favour of Raphael's suggestion to define the hash of a node
to be H(L || R || V) where L and R are the hash of the left and right
child while V denotes the value of the current node. It seems simpler
than H(H(L || R) || V).

- Joel


On 2/23/19 10:27 AM, Richard Barnes wrote:
> At the last interim, we filed an issue to discuss simplifying the key
> schedule.  In particular, folks were unhappy that the whole, possibly
> gigantic GroupState object has to be hashed on every epoch change.  When
> I sat down to think about how to simplify, though, it wasn't clear what
> direction to go in, because it wasn't clear exactly what people were
> finding painful.
> 
> I'll posit from the start that I'm pretty sure that we need to somehow
> include the tree, the roster, and the message transcript into the key
> schedule.  The transcript inclusion follows from the same reasons as it
> does in TLS; the roster and tree are needed because the whole transcript
> isn't accessible to a new joiner (so the new joiner can't infer those
> values from the transcript).
> 
> With that said, there are multiple ways to represent those values for
> inclusion the key schedule:
> 
> 1. Directly (as now)
> 2. As hashes
> 3. As structured hashes, e.g., a Merkle or "ampelmann" tree
> 
> People are clearly unhappy with (1).  The difference between 2 and 3
> (and between variants of 3) is what value we're trying to achieve.  (2)
> reduces the amount of hashing somewhat, but it doesn't change the amount
> of state you have to keep -- each member has to keep the full roster and
> tree, and hash the whole thing when it changes.
> 
> For (3), it seems like there are weak and strong forms..  In the weak
> form, we make it so that the member has to hash a sub-linear-size amount
> of data to update its representation of the tree/roster when it changes
> (e.g., in a Merkle tree, you have to recompute log(N) nodes)..  But each
> member still has to have the whole structure.  In the strong form, we
> arrange the protocol so that a member can operate while holding a
> sub-linear amount of state (e.g., just a copath/frontier in some tree). 
> Note that the strong form would entail pretty major protocol changes,
> e.g., bringing back the Merkle membership proofs that were in earlier
> versions of the protocol.  For similar reasons, this would also entail
> larger messages.
> 
> So basically, the mapping of options to values is:
> 
> 1. Overall simplicity
> 2. A smaller coefficient on the linear amount of data hashed
> 3(weak). Sub-linear amount of data hashed
> 3(strong). Sub-linear amount of state
> 
> In my personal value judgement, I would probably end up somewhere
> between (2) and (3,weak).  It seems pretty unavoidable to me that
> members will have *some* linear-size state for the group, even if it's
> only a names for the members; adding say a key and a roster entry for
> each of those doesn't seem like a huge increment.  So appealing though
> the state savings would be, they're not compelling enough to merit the
> protocol complexity and overhead.
> 
> All that said, I would be interested in other folks' thoughts on which
> options are to be preferred here, or if there are other options that
> come to mind.
> 
> Thanks,
> --Richard
> 
> 
> _______________________________________________
> MLS mailing list
> MLS@ietf.org
> https://www.ietf.org/mailman/listinfo/mls
>