Re: [MLS] Unpredictable epochs?

Raphael Robert <raphael@wire.com> Fri, 03 January 2020 15:32 UTC

Return-Path: <raphael@wire.com>
X-Original-To: mls@ietfa.amsl.com
Delivered-To: mls@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id E6E5912004F for <mls@ietfa.amsl.com>; Fri, 3 Jan 2020 07:32:31 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.899
X-Spam-Level:
X-Spam-Status: No, score=-1.899 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=wire-com.20150623.gappssmtp.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id dcCFdE218Ryd for <mls@ietfa.amsl.com>; Fri, 3 Jan 2020 07:32:28 -0800 (PST)
Received: from mail-lj1-x22b.google.com (mail-lj1-x22b.google.com [IPv6:2a00:1450:4864:20::22b]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 1496012000F for <mls@ietf.org>; Fri, 3 Jan 2020 07:32:28 -0800 (PST)
Received: by mail-lj1-x22b.google.com with SMTP id m26so41788217ljc.13 for <mls@ietf.org>; Fri, 03 Jan 2020 07:32:27 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=wire-com.20150623.gappssmtp.com; s=20150623; h=from:message-id:mime-version:subject:date:in-reply-to:cc:to :references; bh=sYOblLcihSPdJO0dl5yuoQ+nu8UWOlD8wSeRaJVwx4c=; b=i5xROw5kmdDusIuVEldgLfZUzkgV14s82uvlJUp4muTWWKUlyzPxkOrCtOFzu9cFR2 RIWiYC0voLoXqqSICZjo16wqrxLf48O/HYPDy+eVdurIkXh5OGEKuKkY7K4EC2/AgY/B ZW1gjmZHpjqE1S51NA89MJhq6FC+DyrhB1ZZYzrGOwDxq/5Tg1yfhHG4vpVTAsRmgdk9 hPQ2l9aX5c5J14ns6x9/S0ZCQWyAao4UYDGnTGcM5+XkCFLxh1O8AQfmq3OP7dYP0l/5 EbVdUUqPp+rhSxF54zlQg7q7AtbLoa84DvnMujOXFU/ZeCCReuMbK4q5uXMbNtDx6bXd ycfw==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:message-id:mime-version:subject:date :in-reply-to:cc:to:references; bh=sYOblLcihSPdJO0dl5yuoQ+nu8UWOlD8wSeRaJVwx4c=; b=d35fsA8YBhHziotmjl1rPBHQRA1zJGH8OPW+lEfOr5G7rt9KcnXxuBJuvhzD+A6Ifn EzTOYR0CWadIHCo+vK+TSm8DKLKCg6ZI9kHN/VddA7JJ6BiCd2/0JP/LZvAA7QjphWqC hJZSZQYd7tGy0Q2QWYVaEYNW47OYaabcvCZGgNTiEHOQP+yz2fCasiLoUssvysUTsANk RM2KYiEK8+pFScKuXSB1m4bjKtN/TT+ZPaM7rgewa8f/G85LnMYQk2/S8D4oSSroDSvl /ol54cep4tFALqC2U79qzhBQXIZfXcNhdAZYHD9uZm10TfWrw6IbQVZpBe289UQgZqSe cvaA==
X-Gm-Message-State: APjAAAWqNwhKbGQPc8vOpUYHN1q5ALrkDY6jZHgl9+DmNQx+zP0FwWW4 BQ56xg5eWpGvvoanE2ZjzRix/w==
X-Google-Smtp-Source: APXvYqxgnJOo0SLZ+kT+Rziz1i/DrVbnomcl6gMVPbVlLF2q6FHfHDjUl6Bs/ImLMlq58brlDpVjDQ==
X-Received: by 2002:a2e:8745:: with SMTP id q5mr53417376ljj.208.1578065545998; Fri, 03 Jan 2020 07:32:25 -0800 (PST)
Received: from rmbp.wire.local (h-62.96.148.44.host.de.colt.net. [62.96.148.44]) by smtp.gmail.com with ESMTPSA id a14sm24786635lfh.50.2020.01.03.07.32.24 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 03 Jan 2020 07:32:24 -0800 (PST)
From: Raphael Robert <raphael@wire.com>
Message-Id: <4E8A7725-EBC6-4BD4-8D3A-5C52D4712D39@wire.com>
Content-Type: multipart/alternative; boundary="Apple-Mail=_48F2E490-D5AE-4BE9-8CA3-C62B5CBCC9FD"
Mime-Version: 1.0 (Mac OS X Mail 13.0 \(3608.40.2.2.4\))
Date: Fri, 3 Jan 2020 16:32:23 +0100
In-Reply-To: <CAOdM_7kJsmp-ViEjdQJUE2rxRExJmM9ODEvs77Xjh3AQrpckKw@mail.gmail.com>
Cc: Richard Barnes <rlb@ipv.sx>, Michael Rosenberg <micro@fastmail.com>, Messaging Layer Security WG <mls@ietf.org>, Benjamin Beurdouche <benjamin.beurdouche@inria.fr>, Jon Callas <jon@callas.org>
To: Kelvin Ritland <kelvinr=40google.com@dmarc.ietf.org>
References: <CAL02cgSE1xTF2Wsq-u=BCu2Z_4UzMzqMPi=D_H7_7hbRpUMVVA@mail.gmail.com> <2D195D14-9F9A-4D64-92EF-35C601C52C01@inria.fr> <CAL02cgR8gQ6cH_QXd_9v46aJ5aeo=b=1GiYu9YxCNYzJb0tOFQ@mail.gmail.com> <B36BF8F5-EAE9-4C96-A867-82CDFBF830C0@inria.fr> <CAL02cgQ7-JMQsG6sq6YBB3G-5tmCVQoo07nvW63tzBzHPQ0ZWw@mail.gmail.com> <AC74CACD-541B-49CD-9CC9-63343307A53D@fastmail.com> <CEAEFC1B-7581-4263-A45D-44E7348D7DB7@callas.org> <80ABB08C-AEE6-4E94-A972-4983E12241B6@inria.fr> <CAL02cgQjp5+C1eYztrqNW2xO+4RLjp5TER0NAe04jti5hgg2yA@mail.gmail.com> <CAL02cgSr07Kd9RR76Eb=oFLr_eBZ97LkpboOXV3ky4oBL6N5zA@mail.gmail.com> <CAOdM_7kJsmp-ViEjdQJUE2rxRExJmM9ODEvs77Xjh3AQrpckKw@mail.gmail.com>
X-Mailer: Apple Mail (2.3608.40.2.2.4)
Archived-At: <https://mailarchive.ietf.org/arch/msg/mls/3by940KdahdyFEWAIecDQc5Oi-I>
Subject: Re: [MLS] Unpredictable epochs?
X-BeenThere: mls@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Messaging Layer Security <mls.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/mls>, <mailto:mls-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/mls/>
List-Post: <mailto:mls@ietf.org>
List-Help: <mailto:mls-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/mls>, <mailto:mls-request@ietf.org?subject=subscribe>
X-List-Received-Date: Fri, 03 Jan 2020 15:32:32 -0000

I agree with Kelvin’s points in general. Allowing forks might be an interesting goal, but it raises a wealth of new questions.
On the other hand I think Richard’s proposal of extending the Epoch ID with a commit hash makes sense to detect potential forks.

I’d like to propose that we adopt that as a first step, but only in the context of fork detection for now. This doesn’t preclude us from building on top of that in the future, but we also don’t introduce something dangerous right now.

Raphael

> On 3 Jan 2020, at 00:15, Kelvin Ritland <kelvinr=40google.com@dmarc.ietf.org>; wrote:
> 
> Not-pseudorandom implies some assumptions, e.g., that each sender would only send one Commit on top of each epoch
> This isn't bad imo, it reduces complexity to some extent. I'm somewhat in favour of a non-pseudorandom tiebreaker if there's no other arguments against it.
> 
> How many bits of sequence number?
> Have we considered using varints? Similar to the CTLS proposal - https://tools.ietf.org/id/draft-rescorla-tls-ctls-02.html#rfc.section.3.1 <https://tools.ietf.org/id/draft-rescorla-tls-ctls-02.html#rfc.section.3.1>
> 
> 
> Other forking considerations, I think there's more involved than just changing the epoch identifier:
> 
> 1. We'll need merges as well, since forks can lead to inconsistent group membership:
> Suppose in epoch 1 a group has A, B, and C
> A removes C in epoch 2A
> B adds D in epoch 2B
> 
> We now have two group states 2A with AB and 2B with ABCD - and it's unclear which group to use to send with. We could mandate that any client MUST send a "merge" commit on it's view of history before sending any message. This commit could just be the list of epochs that are being merged (+a usual DirectPath update?), and processing it would mean replaying all proposals up to the common ancestor for those epochs on the common ancestor.
> 
> 2. Forward secrecy issues.
> By allowing forks, devices must now keep the previous epoch around to derive possible future forks - but this means we'll be re-using init_secret_. A possible solution could be to use the nth app secret from the ASTree instead of init_secret_ as the start for the new epoch secret. This ties nicely with using a non-pseudorandom tiebreaker.
> 
> 3. Ordering implications
> It's unclear what a sensible ordering strategy is for the DS if we allow forking. (simply ensuring that each commit is based on the "current" epoch isn't well defined since there can be multiple "current" epochs).
> 
> By allowing forks the DS need not enforce ordering at all imo. This means that apps need to keep around old epochs, but this is already required per (2). A different question is how many we need to keep around - perhaps a TTL or the past N epochs after a topological sort would make sense.
> 
> Generally I think getting rid of server ordering is good idea:
> - Helps Matrix and general federation use cases
> - A common consistent mutated state for group operations can cause server side DB contention in my experience
> 
> Forks is one way to remove ordering, and if we need to support it anyways it's a nice benefit.
> 
> On Thu, Jan 2, 2020 at 12:52 PM Richard Barnes <rlb@ipv.sx>; wrote:
> Argh, hit send too soon.  Continuing below...
> 
> On Thu, Jan 2, 2020 at 3:38 PM Richard Barnes <rlb@ipv.sx>; wrote:
> Hey all,
> 
> Resurrecting this thread, as it seems that we never came to resolution on this question.
> 
> I'd like to propose we scope down and solve a narrower problem, namely the problem of enabling forks in the group's history.  So we're not going to try to provide any privacy properties, just remove the impediment to forking.  This seems like a worthwhile thing to me just because it seems silly to have forking blocked just because of syntactic constraint, in addition to it likely being necessary for some decentralized use cases (e.g., Matrix).
> 
> To allow forks, we really just need some "tie breaker" bits in the epoch ID so that if a given epoch has multiple successors, they end up with different epoch IDs.  And in order to avoid synchronization, each participant needs to be able to generate those bits independently.  There are a couple of basic questions about how to construct the tie breaker:
> 
> 1. Pseudorandom or not?
>   - Not-pseudorandom example: tiebreaker = commitSenderID
>   - Pseudorandom example: tiebreaker = H(MLSPlaintext(Commit))
>   - Not-pseudorandom implies some assumptions, e.g., that each sender would only send one Commit on top of each epoch
>   - Pseudorandom risks random collisions
>   - Possible to do both: tiebreaker = commitSenderID || H(MLSPlaintext(Commit))
> 
> 2. How many bits of tiebreaker?
>   - Non-pseudorandom size will be dictated by what we include
>   - Pseudorandom size will be dictated by tolerable collision probability
>   - Probability of collision ~ (number of forks per seq no) / 2^{number of bits}
> 
> 3. How many bits of sequence number?
> - DS might want to see sequence to help enforce ordering
> - Probably want this big enough to avoid wrapping.
> 
> (Note that this is a value that goes in every message, so there's some incentive to keep things small.)
> 
> Personally, my proposal would be something like:
> 
> struct {
>   uint64 sequence_number;
>   uint64 commit_hash; // H(MLSPlaintext(Commit))
> } EpochID;
> 
> That seems to strike a reasonable balance between low collision probability (~2^-64) and a reasonably small identifier.  If 128 bits is good enough for IPv6, it can be good enough for us :)
> 
> Does that seem workable to folks?
> 
> --Richard
> 
>  
> 
> 
> 
> 
> 
> On Fri, Apr 26, 2019 at 4:18 PM Benjamin Beurdouche <benjamin.beurdouche@inria..fr <mailto:benjamin.beurdouche@inria.fr>> wrote:
> 
> > On Apr 26, 2019, at 9:26 PM, Jon Callas <jon@callas.org <mailto:jon@callas.org>> wrote:
> > 
> >> On Apr 26, 2019, at 2:22 AM, Michael Rosenberg <micro@fastmail.com <mailto:micro@fastmail.com>> wrote:
> >> 
> >> So why not remove epoch entirely?
> > 
> > An epoch lets you deal with things happening neither too often nor not often enough. Presume there is a client that is either malicious or just stupid. You want to keep it from forcing a rekey every 100µs. You want to force a rekey every so often. Hence epochs. Yeah, picking the right epoch size is an exercise left to the reader.
> 
> You can’t use the epoch number for that as it is just global counter for group operations, we will have to keep track of the latest group operation “timestamp” for each member within the group state to check “update frequency” and handle some of the situations you described.
> 
> Btw in the TreeKEM formal spec I use a 64 bit unsigned integer, and I feel like having more than 2^32 group operations over the lifetime of a group is not unrealistic in certain extreme use cases, especially with large groups forcing PCS for application messages by triggering an update after each app message...
> 
> We could remove the epoch number if we really want but it is necessary to give the Delivery Service some ordering information (unpredictable or not is an interesting question) to handle concurrent handshake messages which is, I believe, the main current goal of that information.
> 
> Benjamin
> _______________________________________________
> MLS mailing list
> MLS@ietf.org <mailto:MLS@ietf.org>
> https://www.ietf.org/mailman/listinfo/mls <https://www.ietf.org/mailman/listinfo/mls>
> _______________________________________________
> MLS mailing list
> MLS@ietf.org
> https://www.ietf.org/mailman/listinfo/mls