Re: [MLS] Hiding content type

On Sat, Jul 25, 2020 at 6:10 PM Brendan McMillion <brendan@cloudflare.com>
wrote:

> I’m concerned about adversarial collisions, which would cause a partial
> DoS, as you say. There are other ways to cause partial DoS in an MLS group,
> but I think it’s good to minimize the avenues for attack.
>

I wonder if this could in part be addressed just by using a longer
identifier to make finding adversarial collisions impractical.
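
As a rough back-of-the-envelope check on identifier length, here is a small
sketch of the standard birthday approximation. It assumes identifiers behave
like uniformly random bit strings, which is an idealization; the function
names are made up for illustration. Whether an attacker needs any colliding
pair or a match against one specific epoch ID depends on the attack model.

```python
import math


def collision_probability(num_ids: int, id_bits: int) -> float:
    """Approximate chance that at least two of `num_ids` uniformly random
    `id_bits`-bit identifiers collide (birthday approximation)."""
    pairs = num_ids * (num_ids - 1) / 2
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-pairs / 2 ** id_bits)


def expected_targeted_attempts(id_bits: int) -> int:
    """Expected number of random trials needed to hit one specific
    `id_bits`-bit identifier value."""
    return 2 ** id_bits


if __name__ == "__main__":
    for bits in (64, 128, 256):
        p = collision_probability(2 ** 32, bits)
        print(f"{bits}-bit IDs, 2^32 candidates: p(collision) ~ {p:.3g}")
```

With 64-bit identifiers, about 2^32 candidates already give a substantial
chance of some pair colliding; at 128 bits or more the same effort is
negligible.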

> In your example, it sounds like you’re trying to achieve unlinkability
> between messages, with a Generic Commit Sequencing Service which is
> basically a KV store that only accepts one write per key. So in this
> scenario, I think you would be much better off by simply truncating the
> group ID and epoch from MLSCiphertext (making them implicit), and choosing
> the key that each message is stored under with some function of: 1.) a
> group secret, and 2.) a message counter.
>
> Would that work?
>
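
For concreteness, the storage-key idea quoted above might look roughly like
the following. This is only a sketch: HMAC-SHA256 is used as a stand-in PRF,
and the name `storage_key` and the 8-byte counter encoding are assumptions,
not anything from the MLS draft.

```python
import hashlib
import hmac


def storage_key(group_secret: bytes, message_counter: int) -> bytes:
    """Hypothetical lookup key for one message: a PRF of a group secret
    and a message counter. Only holders of the secret can compute it."""
    counter_bytes = message_counter.to_bytes(8, "big")  # assumed encoding
    return hmac.new(group_secret, counter_bytes, hashlib.sha256).digest()


if __name__ == "__main__":
    secret = bytes(32)  # placeholder secret, for illustration only
    print(storage_key(secret, 0).hex())
    print(storage_key(secret, 1).hex())
```

To a store that does not know the secret, keys for successive counters look
unrelated, which is where the unlinkability would come from.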

Interesting, but I don't think so.  I'm concerned about forcing
applications to invent too much here.  It seems like from what we know now,
some applications are going to need to identify the group ID / epoch in the
message, and in some applications, the group ID / epoch is going to be
clear from context.  That latter class of application would probably be OK
with an implicit GroupID / Epoch.  But in the former class, you would be
forcing the application to invent its own scheme for signaling this stuff,
which risks them making decisions that are suboptimal for metadata privacy.

So my preference would still be to provide an identifier that offers the
minimal properties required to make the protocol work.  But I would be
happy to acknowledge that applications can leak more metadata if they need
to, and that metadata can be ultimately authenticated by the protocol.  In
fact, we could make this explicit by making the AAD for the ciphertext
encryption include the GroupID and Epoch counter in addition to the opaque
epoch ID.

What would folks think about the following proposal:

1. MLSCiphertext includes an opaque EpochID
2. ... with a length that makes finding collisions infeasible (16 bytes?
32 bytes?  KDF.Nh bytes?)
3. We add some explicit text noting that additional metadata can be added
by the application
4. ... and update the AAD for the sender data and content encryptions to
use all three values (group ID, epoch counter, epoch ID)
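
To make the shape of the proposal concrete, here is a minimal sketch of
points 1, 2 and 4. Everything specific in it is an assumption for
illustration: HMAC-SHA256 stands in for the key-schedule derivation, the
label string is invented, and the length-prefixed AAD layout is not the MLS
wire encoding.

```python
import hashlib
import hmac


def opaque_epoch_id(epoch_secret: bytes, length: int = 32) -> bytes:
    """Derive an opaque epoch ID from an epoch secret (stand-in PRF)."""
    if not 1 <= length <= hashlib.sha256().digest_size:
        raise ValueError("length must be between 1 and 32 bytes")
    # "example epoch id" is a hypothetical label, not taken from the draft.
    tag = hmac.new(epoch_secret, b"example epoch id", hashlib.sha256).digest()
    return tag[:length]


def _length_prefixed(value: bytes) -> bytes:
    """Encode a byte string with a 2-byte big-endian length prefix."""
    return len(value).to_bytes(2, "big") + value


def build_aad(group_id: bytes, epoch_counter: int, epoch_id: bytes) -> bytes:
    """Bind all three values (group ID, epoch counter, epoch ID) into the
    additional authenticated data, using an assumed illustrative layout."""
    return (
        _length_prefixed(group_id)
        + epoch_counter.to_bytes(8, "big")
        + _length_prefixed(epoch_id)
    )


if __name__ == "__main__":
    eid = opaque_epoch_id(bytes(32), length=16)
    print(build_aad(b"example-group", 7, eid).hex())
```

Only the opaque epoch ID would appear in the ciphertext header; the group ID
and counter would be known to members and authenticated through the AAD.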

--Richard

> On Jul 25, 2020, at 12:38 PM, Richard Barnes <rlb@ipv.sx> wrote:
>
> Just so we're clear, in your example, are you concerned about accidental
> or adversarial collisions?  I hope we agree that we can fix the accidental
> case just by making a longer epoch ID.  In the adversarial case, this seems
> like a partial DoS at worst, since the members who incorrectly think they
> are in the previous epoch will no longer be decrypting with the right keys.
>
> As far as a positive example: Imagine a Generic Commit Sequencing Service
> that provides a reliable broadcast/ledger over anonymous channels (e.g., as
> an onion service), but will only broadcast/record one message for each
> groupID/epoch.  If the groupID/epoch is opaque, then such a service can be
> provided without learning anything about the groups it serves or their
> evolution.
>
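
A toy model of the service described above, as a sketch only: an in-memory
store that accepts one message per opaque key. The class and method names
are hypothetical, and a real service would also need persistence, broadcast
and anonymous transport, none of which is shown.

```python
from typing import Dict, Optional


class WriteOnceLedger:
    """Accepts at most one message per opaque key (e.g. groupID/epoch)."""

    def __init__(self) -> None:
        self._entries: Dict[bytes, bytes] = {}

    def submit(self, opaque_key: bytes, message: bytes) -> bool:
        """Record `message` under `opaque_key`; reject any later write."""
        if opaque_key in self._entries:
            return False
        self._entries[opaque_key] = message
        return True

    def fetch(self, opaque_key: bytes) -> Optional[bytes]:
        """Return the recorded message for `opaque_key`, if any."""
        return self._entries.get(opaque_key)


if __name__ == "__main__":
    ledger = WriteOnceLedger()
    print(ledger.submit(b"opaque-key", b"commit A"))  # first write wins
    print(ledger.submit(b"opaque-key", b"commit B"))  # rejected
```

The service only compares keys for equality, so it needs no knowledge of
which group or epoch a key refers to.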
>
> On Sat, Jul 25, 2020 at 1:50 PM Brendan McMillion <brendan@cloudflare.com>
> wrote:
>
>> Take for example, a broadcast channel that's ordered but lossy. A Commit
>> gets sent where the new epoch id collides with the previous epoch id, but
>> not all members get the Commit. So now the group is fractured and there's
>> no way for members in the previous epoch to know that they missed a
>> message. What the application developer needs to do to detect lost
>> messages, is immediately re-implement the counter epoch id that you've just
>> removed.
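
The loss-detection point above can be illustrated with a small sketch. It
assumes the epoch is a counter that increases by one per Commit; the
function name is made up for this example.

```python
def missed_commits(local_epoch: int, received_epoch: int) -> int:
    """Number of Commits a member has not processed, given the epoch
    counter on an incoming message and the member's own epoch counter."""
    return max(0, received_epoch - local_epoch)


if __name__ == "__main__":
    # A member at epoch 5 sees a message tagged with epoch 7.
    print(missed_commits(5, 7))
```

With an opaque epoch ID alone there is no ordering to compare, so a member
seeing an unfamiliar ID cannot tell how far behind it is.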
>>
>> That's an example of a system where this change is pointless / harmful.
>> If you could provide an example of a system where the change is *helpful*,
>> that would be more interesting.
>>
>> On Fri, Jul 24, 2020 at 11:05 AM Richard Barnes <rlb@ipv.sx> wrote:
>>
>>> I'm honestly really confused by the worry about collisions here.  The
>>> current epoch ID collides trivially -- it's just a counter!  And the group
>>> ID is assumed to be public anyway.  So if your authentication scheme is
>>> broken when you have a (group ID, epoch ID) collision, then your scheme is
>>> already broken.
>>>
>>> If we're going to block this change on this basis, then we need to see
>>> a clearly articulated authentication scheme that is secure in the current
>>> model, but broken in the case where epoch IDs are derived off of the key
>>> schedule.
>>>
>>> --Richard
>>>
>>> On Tue, Jul 14, 2020 at 10:38 AM Hale, Britta (CIV) <britta.hale@nps.edu>
>>> wrote:
>>>
>>>> Ensuring the most conservative design by default seems advisable;
>>>> however it is not clear that this proposal for hiding content type is
>>>> actually doing that. Indeed, if an application is concerned about hiding
>>>> the content type, would not that same type of application also be concerned
>>>> about collisions?
>>>>
>>>>
>>>>
>>>> As Brendan notes, we are not talking about accidental collisions here.
>>>> Pre-computation is entirely possible, such as by a malicious group member,
>>>> so the window of attack is not limited to one epoch.
>>>>
>>>>
>>>>
>>>> If this PR in particular is of interest to some, then
>>>> it would be good to see a clarifying explanation as to the security it
>>>> achieves (i.e. why this is “security by design” despite introducing such
>>>> collision possibilities), or a concrete use-case. Alternatively, are we
>>>> really limited to a 64-bit field, or can that length be adjusted to
>>>> mitigate the introduced problems?
>>>>
>>>>
>>>>
>>>> A clarifying point for those following this thread: some references in
>>>> the email chain state that this is encryption of the epoch ID. That is not
>>>> the case. The proposal being discussed is about a hash thereof (hence
>>>> discussion on collisions).
>>>>
>>>>
>>>>
>>>> Britta
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> *From: *MLS <mls-bounces@ietf.org> on behalf of Brendan McMillion
>>>> <brendan=40cloudflare.com@dmarc.ietf.org>
>>>> *Date: *Monday, July 13, 2020 at 9:37 AM
>>>> *To: *Richard Barnes <rlb@ipv.sx>
>>>> *Cc: *Messaging Layer Security WG <mls@ietf.org>
>>>> *Subject: *Re: [MLS] Hiding content type
>>>>
>>>>
>>>>
>>>>> As far as collision resistance, I'm not too worried about collisions
>>>>> among 64-bit random values, especially as the scope for collision is fairly
>>>>> small, arguably just one epoch.
>>>>
>>>>
>>>>
>>>> The issue isn't with accidental collisions, it's with malicious ones.
>>>> An attacker can purposefully (and quickly) generate commits that create
>>>> duplicate epoch ids, and send them to a client to corrupt their group state.
>>>>
>>>>
>>>>
>>>>> I think the idea here is to be conservative by default.  There are
>>>>> systems in which you can hide the metadata you're talking about with things
>>>>> like mixnets and dropboxes.
>>>>
>>>>
>>>>
>>>> It's not just being more conservative; you're making changes
>>>> specifically for very niche use-cases where I don't think the security of
>>>> the whole system has been thought through. Maybe you have an example in
>>>> mind, but I can't think of a system where this change provides any
>>>> additional privacy.
>>>>
>>>>
>>>>
>>>> On Mon, Jul 13, 2020 at 10:56 AM Richard Barnes <rlb@ipv.sx> wrote:
>>>>
>>>> On Mon, Jul 13, 2020 at 10:57 AM Brendan McMillion <
>>>> brendan@cloudflare.com> wrote:
>>>>
>>>>> With respect to using an opaque epoch id, I believe this was proposed
>>>>> in #245 <https://github.com/mlswg/mls-protocol/pull/245> and consensus
>>>>> was against it because it makes it more difficult to implement the DS. It's
>>>>> also not clear to me what security properties you want from an opaque epoch
>>>>> id, because the current PR doesn't provide collision resistance and I'd
>>>>> expect this to be a source of confusion and bugs.
>>>>
>>>>
>>>>
>>>> I think the discussion on #245 got derailed a bit by the focus on
>>>> non-linear epochs.  The point here is that even in a case where you have
>>>> linear history, I think there's intrinsic value in the epoch ID being
>>>> opaque because it gives intermediaries less opportunity for ossification.
>>>>
>>>>
>>>>
>>>> As far as collision resistance, I'm not too worried about collisions
>>>> among 64-bit random values, especially as the scope for collision is fairly
>>>> small, arguably just one epoch.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>> With respect to encrypting the content type, it also seems to me that
>>>>> this would cause issues because different content types have different
>>>>> delivery guarantees. Specifically: messages are allowed to be unordered and
>>>>> lossy, proposals are allowed to be unordered but not lossy, and commits are
>>>>> both ordered and not lossy. In the deployment scenarios we've talked about,
>>>>> the DS essentially always needs to know the content type to provide this.
>>>>> Conversely, you don't get much additional privacy from encrypting the
>>>>> content type because an eavesdropper can see the epoch id change and infer
>>>>> which message was a commit, or see a dropped message getting re-sent and
>>>>> infer it was a proposal.
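
The delivery guarantees listed above can be summarized as a small lookup
table. This is a sketch only: the type and field names are invented for
illustration and do not come from the MLS draft.

```python
from dataclasses import dataclass
from enum import Enum


class ContentType(Enum):
    APPLICATION = "application"
    PROPOSAL = "proposal"
    COMMIT = "commit"


@dataclass(frozen=True)
class DeliveryRequirement:
    ordered: bool   # must be delivered in a consistent order
    reliable: bool  # must not be lost


# Requirements as stated in the message above.
REQUIREMENTS = {
    ContentType.APPLICATION: DeliveryRequirement(ordered=False, reliable=False),
    ContentType.PROPOSAL: DeliveryRequirement(ordered=False, reliable=True),
    ContentType.COMMIT: DeliveryRequirement(ordered=True, reliable=True),
}

if __name__ == "__main__":
    for content_type, req in REQUIREMENTS.items():
        print(content_type.value, req)
```

A delivery service that has to enforce these per-type rules needs some way
to learn the content type, which is the tension being discussed here.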
>>>>
>>>>
>>>>
>>>> I don't disagree that there are different guarantees you need, but like
>>>> I said, I think the idea here is to be conservative by default.  There are
>>>> systems in which you can hide the metadata you're talking about with things
>>>> like mixnets and dropboxes.  Keeping the ciphertext as opaque as possible
>>>> keeps the door open to running MLS over those systems without MLS's own
>>>> metadata undermining their privacy properties.
>>>>
>>>>
>>>>
>>>> FWIW, in the systems I'm looking at building, this PR does not make any
>>>> difference, because clients use separate channels for Proposals/Commits vs.
>>>> application data and the server doesn't check.
>>>>
>>>>
>>>>
>>>> --Richard
>>>>
>>>>
>>>>
>>>> On Mon, Jul 13, 2020 at 8:56 AM Richard Barnes <rlb@ipv.sx> wrote:
>>>>
>>>> Hi all,
>>>>
>>>>
>>>>
>>>> Recall that PR#349 does two things: (1) use an opaque epoch ID, and (2)
>>>> encrypt the content type of the message (application / proposal / commit)
>>>> [1].
>>>>
>>>>
>>>>
>>>> There was some discussion on the call last week that encrypting the
>>>> content type might not be that useful, since an application that relies on
>>>> a server to assure ordering of commits will need to at least be able to
>>>> tell commits from other things.  Of course, even if the content type is
>>>> encrypted by default, the application can add it back in authenticated
>>>> data, or in some unauthenticated wrapper.
>>>>
>>>>
>>>>
>>>> Net of those considerations, I'm personally still inclined to merge the
>>>> PR, to have a conservative baseline.  But those on the call thought it
>>>> would be useful to pose this to the group, so please speak up if you have
>>>> concerns.
>>>>
>>>>
>>>>
>>>> --Richard
>>>>
>>>>
>>>>
>>>> [1] https://github.com/mlswg/mls-protocol/pull/349
>>>>