Re: [MLS] Unpredictable epochs?

Kelvin Ritland <kelvinr@google.com> Thu, 02 January 2020 23:16 UTC

Return-Path: <kelvinr@google.com>
X-Original-To: mls@ietfa.amsl.com
Delivered-To: mls@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 527B01207FB for <mls@ietfa.amsl.com>; Thu, 2 Jan 2020 15:16:04 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -17.499
X-Spam-Level:
X-Spam-Status: No, score=-17.499 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIMWL_WL_MED=-0.001, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, ENV_AND_HDR_SPF_MATCH=-0.5, HTML_MESSAGE=0.001, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001, USER_IN_DEF_DKIM_WL=-7.5, USER_IN_DEF_SPF_WL=-7.5] autolearn=ham autolearn_force=no
Authentication-Results: ietfa.amsl.com (amavisd-new); dkim=pass (2048-bit key) header.d=google.com
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id Jl05fD05i2WT for <mls@ietfa.amsl.com>; Thu, 2 Jan 2020 15:16:01 -0800 (PST)
Received: from mail-ua1-x933.google.com (mail-ua1-x933.google.com [IPv6:2607:f8b0:4864:20::933]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 3000F120096 for <mls@ietf.org>; Thu, 2 Jan 2020 15:16:01 -0800 (PST)
Received: by mail-ua1-x933.google.com with SMTP id c7so11158059uaf.5 for <mls@ietf.org>; Thu, 02 Jan 2020 15:16:01 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=bNYwrOckHYX++oUalrmqDi9oIZIA7i/p7xzJrsnZzDY=; b=iuQr1GBfPDUFwa92CZE2oVKs0fUVpHCuuPvXIgFj6cbeTmiz7qe/rCW06yEncAoIuB 1qU5cU8MQ953MQumIn1DLpzj+YKo4wdei5c0tgZKG4RL7lnzZDw4EFQJath3GeZtLd52 sEb3RjoKBxxIZyq25TCbmyZ3PwKj3NTX+5uh721yKObVMIVrFY8Xc6M3UEiJgsfev/tb O4XAge8uPRNpSBm8azDIcMPrxuycjpQxU9qJHuFDViLZ/DlBk9YYQiObuta2kbgEH2PJ TuTIT6J0H01RLkz7yHfJuc0BvrtWmc6ic2lAmtCBMMyGphed7ryFjFVzXB0cckmf+Geb cBJQ==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=bNYwrOckHYX++oUalrmqDi9oIZIA7i/p7xzJrsnZzDY=; b=EVLjl3z8sQgsK5k2kb8920uGGXBgfUeKoO0JvlOQ7NVlNizzLvMoPlz/ZXRGZhY1HT tLOLPuyChbTXYBwCwpvjxpTdWVQLg27J/4gf5ZyPAb53y+qaN+VZE2t9MADuFY2o75Nm qhBwyGiSGdmQsyMGDLRd51PbDbL082X/6j5c86GScBgzXfG3tkLOyGa78+mrytVvcUHg GTwMXv6PK+l1QARdY71M35E5ZJutwEC/m1TD/A+egSGUn0QRv1L/u7TR+Yto5aW/dXl7 17PeLq2VjxRfD6rluSO0IYUHqZmtK/NCMIOBRfZfymWwXLhITi0NNkZWto/oSgeshQA0 6seg==
X-Gm-Message-State: APjAAAXI2YFeBjUr4pDjoOu3C07MrxcV8M/cXxo7THN8B/b14G4yXLGw AilqgvPPwaAZsLEU0sJpD11rAPtKEthxb8pUIWBYqA==
X-Google-Smtp-Source: APXvYqy61EASS/vQpJaChGTpeY0M2jKRzU63CQS+IBdvPheQqt9aqEPL+8iWEnyhrKkKFdOEsBpjzk2YTuh/r7tvm6c=
X-Received: by 2002:ab0:6505:: with SMTP id w5mr22190924uam.125.1578006960030; Thu, 02 Jan 2020 15:16:00 -0800 (PST)
MIME-Version: 1.0
References: <CAL02cgSE1xTF2Wsq-u=BCu2Z_4UzMzqMPi=D_H7_7hbRpUMVVA@mail.gmail.com> <2D195D14-9F9A-4D64-92EF-35C601C52C01@inria.fr> <CAL02cgR8gQ6cH_QXd_9v46aJ5aeo=b=1GiYu9YxCNYzJb0tOFQ@mail.gmail.com> <B36BF8F5-EAE9-4C96-A867-82CDFBF830C0@inria.fr> <CAL02cgQ7-JMQsG6sq6YBB3G-5tmCVQoo07nvW63tzBzHPQ0ZWw@mail.gmail.com> <AC74CACD-541B-49CD-9CC9-63343307A53D@fastmail.com> <CEAEFC1B-7581-4263-A45D-44E7348D7DB7@callas.org> <80ABB08C-AEE6-4E94-A972-4983E12241B6@inria.fr> <CAL02cgQjp5+C1eYztrqNW2xO+4RLjp5TER0NAe04jti5hgg2yA@mail.gmail.com> <CAL02cgSr07Kd9RR76Eb=oFLr_eBZ97LkpboOXV3ky4oBL6N5zA@mail.gmail.com>
In-Reply-To: <CAL02cgSr07Kd9RR76Eb=oFLr_eBZ97LkpboOXV3ky4oBL6N5zA@mail.gmail.com>
From: Kelvin Ritland <kelvinr@google.com>
Date: Thu, 2 Jan 2020 15:15:48 -0800
Message-ID: <CAOdM_7kJsmp-ViEjdQJUE2rxRExJmM9ODEvs77Xjh3AQrpckKw@mail.gmail.com>
To: Richard Barnes <rlb@ipv.sx>
Cc: Benjamin Beurdouche <benjamin.beurdouche@inria.fr>, Michael Rosenberg <micro@fastmail.com>, Messaging Layer Security WG <mls@ietf.org>, Jon Callas <jon@callas.org>
Content-Type: multipart/alternative; boundary="0000000000004ea4ca059b30604a"
Archived-At: <https://mailarchive.ietf.org/arch/msg/mls/A4mVaXT-rqwZTNYXrC1TDb6XC_o>
Subject: Re: [MLS] Unpredictable epochs?
X-BeenThere: mls@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Messaging Layer Security <mls.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/mls>, <mailto:mls-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/mls/>
List-Post: <mailto:mls@ietf.org>
List-Help: <mailto:mls-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/mls>, <mailto:mls-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 02 Jan 2020 23:16:08 -0000

>
> Not-pseudorandom implies some assumptions, e.g., that each sender would
> only send one Commit on top of each epoch

This isn't bad imo, it reduces complexity to some extent. I'm somewhat in
favour of a non-pseudorandom tiebreaker if there's no other arguments
against it.

How many bits of sequence number?

Have we considered using varints? Similar to the CTLS proposal -
https://tools.ietf.org/id/draft-rescorla-tls-ctls-02.html#rfc.section.3.1


Other forking considerations, I think there's more involved than just
changing the epoch identifier:

1. We'll need merges as well, since forks can lead to inconsistent group
membership:
Suppose in epoch 1 a group has A, B, and C
A removes C in epoch 2A
B adds D in epoch 2B

We now have two group states 2A with AB and 2B with ABCD - and it's unclear
which group to use to send with. We could mandate that any client MUST send
a "merge" commit on it's view of history before sending any message. This
commit could just be the list of epochs that are being merged (+a usual
DirectPath update?), and processing it would mean replaying all proposals
up to the common ancestor for those epochs on the common ancestor.

2. Forward secrecy issues.
By allowing forks, devices must now keep the previous epoch around to
derive possible future forks - but this means we'll be re-using
init_secret_. A possible solution could be to use the nth app secret from
the ASTree instead of init_secret_ as the start for the new epoch secret.
This ties nicely with using a non-pseudorandom tiebreaker.

3. Ordering implications
It's unclear what a sensible ordering strategy is for the DS if we allow
forking. (simply ensuring that each commit is based on the "current" epoch
isn't well defined since there can be multiple "current" epochs).

By allowing forks the DS need not enforce ordering at all imo. This means
that apps need to keep around old epochs, but this is already required per
(2). A different question is how many we need to keep around - perhaps a
TTL or the past N epochs after a topological sort would make sense.

Generally I think getting rid of server ordering is good idea:
- Helps Matrix and general federation use cases
- A common consistent mutated state for group operations can cause server
side DB contention in my experience

Forks is one way to remove ordering, and if we need to support it anyways
it's a nice benefit.

On Thu, Jan 2, 2020 at 12:52 PM Richard Barnes <rlb@ipv.sx>; wrote:

> Argh, hit send too soon.  Continuing below...
>
> On Thu, Jan 2, 2020 at 3:38 PM Richard Barnes <rlb@ipv.sx>; wrote:
>
>> Hey all,
>>
>> Resurrecting this thread, as it seems that we never came to resolution on
>> this question.
>>
>> I'd like to propose we scope down and solve a narrower problem, namely
>> the problem of enabling forks in the group's history.  So we're not going
>> to try to provide any privacy properties, just remove the impediment to
>> forking.  This seems like a worthwhile thing to me just because it seems
>> silly to have forking blocked just because of syntactic constraint, in
>> addition to it likely being necessary for some decentralized use cases
>> (e.g., Matrix).
>>
>> To allow forks, we really just need some "tie breaker" bits in the epoch
>> ID so that if a given epoch has multiple successors, they end up with
>> different epoch IDs.  And in order to avoid synchronization, each
>> participant needs to be able to generate those bits independently.  There
>> are a couple of basic questions about how to construct the tie breaker:
>>
>> 1. Pseudorandom or not?
>>   - Not-pseudorandom example: tiebreaker = commitSenderID
>>   - Pseudorandom example: tiebreaker = H(MLSPlaintext(Commit))
>>   - Not-pseudorandom implies some assumptions, e.g., that each sender
>> would only send one Commit on top of each epoch
>>   - Pseudorandom risks random collisions
>>   - Possible to do both: tiebreaker = commitSenderID ||
>> H(MLSPlaintext(Commit))
>>
>> 2. How many bits of tiebreaker?
>>
>   - Non-pseudorandom size will be dictated by what we include
>   - Pseudorandom size will be dictated by tolerable collision probability
>   - Probability of collision ~ (number of forks per seq no) / 2^{number of
> bits}
>
> 3. How many bits of sequence number?
> - DS might want to see sequence to help enforce ordering
> - Probably want this big enough to avoid wrapping.
>
> (Note that this is a value that goes in every message, so there's some
> incentive to keep things small.)
>
> Personally, my proposal would be something like:
>
> struct {
>   uint64 sequence_number;
>   uint64 commit_hash; // H(MLSPlaintext(Commit))
> } EpochID;
>
> That seems to strike a reasonable balance between low collision
> probability (~2^-64) and a reasonably small identifier.  If 128 bits is
> good enough for IPv6, it can be good enough for us :)
>
> Does that seem workable to folks?
>
> --Richard
>
>
>
>>
>>
>>
>>
>>
>> On Fri, Apr 26, 2019 at 4:18 PM Benjamin Beurdouche <
>> benjamin.beurdouche@inria.fr>; wrote:
>>
>>>
>>> > On Apr 26, 2019, at 9:26 PM, Jon Callas <jon@callas.org>; wrote:
>>> >
>>> >> On Apr 26, 2019, at 2:22 AM, Michael Rosenberg <micro@fastmail.com>;
>>> wrote:
>>> >>
>>> >> So why not remove epoch entirely?
>>> >
>>> > An epoch lets you deal with things happening neither too often nor not
>>> often enough. Presume there is a client that is either malicious or just
>>> stupid. You want to keep it from forcing a rekey every 100µs. You want to
>>> force a rekey every so often. Hence epochs. Yeah, picking the right epoch
>>> size is an exercise left to the reader.
>>>
>>> You can’t use the epoch number for that as it is just global counter for
>>> group operations, we will have to keep track of the latest group operation
>>> “timestamp” for each member within the group state to check “update
>>> frequency” and handle some of the situations you described.
>>>
>>> Btw in the TreeKEM formal spec I use a 64 bit unsigned integer, and I
>>> feel like having more than 2^32 group operations over the lifetime of a
>>> group is not unrealistic in certain extreme use cases, especially with
>>> large groups forcing PCS for application messages by triggering an update
>>> after each app message...
>>>
>>> We could remove the epoch number if we really want but it is necessary
>>> to give the Delivery Service some ordering information (unpredictable or
>>> not is an interesting question) to handle concurrent handshake messages
>>> which is, I believe, the main current goal of that information.
>>>
>>> Benjamin
>>>
>> _______________________________________________
> MLS mailing list
> MLS@ietf.org
> https://www.ietf.org/mailman/listinfo/mls
>