Re: [MLS] Hiding content type

Brendan McMillion <brendan@cloudflare.com> Tue, 28 July 2020 14:11 UTC

From: Brendan McMillion <brendan@cloudflare.com>
Date: Tue, 28 Jul 2020 07:10:48 -0700
Message-ID: <CABP-pSRAP-9SZMgWrdD15SFJn5y7ya9yBHvtEnP+6KBbWtqHdQ@mail.gmail.com>
To: Richard Barnes <rlb@ipv.sx>
Cc: "Hale, Britta (CIV)" <britta.hale@nps.edu>, Messaging Layer Security WG <mls@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/mls/Hl4QyL37kaAvfwo7-3M99jBVOBw>
Subject: Re: [MLS] Hiding content type

>
> But in the former class, you would be forcing the application to invent
> its own scheme for signaling this stuff, which risks them making decisions
> that are suboptimal for metadata privacy.


Making a metadata-resistant deployment of MLS would already be incredibly
difficult because you'd have to design your application to use the network
in a way that prevents useful traffic analysis. Someone who's capable of
this should definitely be capable of choosing how messages are identified
in their system.

The flip side is that your construction is already suboptimal for metadata
privacy because the epoch id only changes once per epoch. In the
construction that I described, the opaque identifier changes with each
message. You could also imagine a system where the identifier would change
with time instead of with protocol messages (for example, where messages
are distributed over Tor circuits).
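
As a rough illustration of an identifier that changes with each message,
here is a minimal sketch that computes it as a function of a group secret
and a message counter. Everything specific in it is an assumption made for
the example and not something taken from the MLS draft: the function name
message_slot_key, the use of HMAC-SHA256, the domain-separation label, and
the 64-bit counter encoding.

```python
import hashlib
import hmac


def message_slot_key(group_secret: bytes, counter: int, length: int = 32) -> bytes:
    """Derive the opaque key a message is stored under (hypothetical helper).

    group_secret: a secret known only to group members (assumed input).
    counter:      a per-message counter, so the output changes every message.
    length:       number of output bytes to keep, at most 32 for SHA-256.
    """
    if not 0 <= counter < 2**64:
        raise ValueError("counter must fit in 64 bits")
    if not 1 <= length <= hashlib.sha256().digest_size:
        raise ValueError("length must be between 1 and 32 bytes")

    # Assumed label; it only separates this use of the secret from others.
    label = b"message slot key"
    mac = hmac.new(group_secret, label + counter.to_bytes(8, "big"), hashlib.sha256)
    return mac.digest()[:length]


if __name__ == "__main__":
    secret = bytes(32)  # placeholder secret, for demonstration only
    for n in range(3):
        print(n, message_slot_key(secret, n).hex())
```

Without the group secret, an observer cannot link two consecutive outputs to
the same group, while a member can compute the key for counter n+1 ahead of
time and look it up. Keeping the full 32-byte output is the conservative
choice if adversarial collisions are a concern.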

On Tue, Jul 28, 2020 at 5:35 AM Richard Barnes <rlb@ipv.sx> wrote:

> On Sat, Jul 25, 2020 at 6:10 PM Brendan McMillion <brendan@cloudflare.com>
> wrote:
>
> >> I’m concerned about adversarial collisions, which would cause a partial DoS
>> like you say. There are other ways to cause partial DoS in an MLS group,
>> but I think it’s good to minimize the avenues for attack.
>>
>
> I wonder if this could in part be addressed just by using a longer
> identifier to make finding adversarial collisions impractical.
>
>
>> In your example, it sounds like you’re trying to achieve unlinkability
>> between messages, with a Generic Commit Sequencing Service which is
>> basically a KV store that only accepts one write per key. So in this
>> scenario, I think you would be much better off by simply truncating the
>> group ID and epoch from MLSCiphertext (making them implicit), and choosing
>> the key that each message is stored under with some function of: 1.) a
>> group secret, and 2.) a message counter.
>>
>> Would that work?
>>
>
> Interesting, but I don't think so.  I'm concerned about forcing
> applications to invent too much here.  It seems like from what we know now,
> some applications are going to need to identify the group ID / epoch in the
> message, and in some applications, the group ID / epoch is going to be
> clear from context.  That latter class of application would probably be OK
> with an implicit GroupID / Epoch.  But in the former class, you would be
> forcing the application to invent its own scheme for signaling this stuff,
> which risks them making decisions that are suboptimal for metadata privacy.
>
> So my preference would still be to provide an identifier that offers the
> minimal properties required to make the protocol work.  But I would be
> happy to acknowledge that applications can leak more metadata if they need
> to, and that metadata can be ultimately authenticated by the protocol.  In
> fact, we could make this explicit by making the AAD for the ciphertext
> encryption include the GroupID and Epoch counter in addition to the opaque
> epoch.
>
> What would folks think about the following proposal:
>
> 1. MLSCiphertext includes an opaque EpochID
> 2. ... with a length that makes finding collisions infeasible (16 bytes?
> 32 bytes?  KDF.Nh bytes?)
> 3. We add some explicit text noting that additional metadata can be added
> by the application
> 4. ... and update the AAD for the sender data and content encryptions to
> use all three values (group ID, epoch counter, epoch ID)
>
> --Richard
>
>
>> On Jul 25, 2020, at 12:38 PM, Richard Barnes <rlb@ipv.sx> wrote:
>>
>> Just so we're clear, in your example, are you concerned about accidental
>> or adversarial collisions?  I hope we agree that we can fix the accidental
>> case just by making a longer epoch ID.  In the adversarial case, this seems
>> like a partial DoS at worst, since the members who incorrectly think they
>> are in the previous epoch will no longer be decrypting with the right keys.
>>
>> As far as a positive example: Imagine a Generic Commit Sequencing Service
>> that provides a reliable broadcast/ledger over anonymous channels (e.g., as
>> an onion service), but will only broadcast/record one message for each
>> groupID/epoch.  If the groupID/epoch is opaque, then such a service can be
>> provided without learning anything about the groups it serves or their
>> evolution.
>>
>>
>> On Sat, Jul 25, 2020 at 1:50 PM Brendan McMillion <brendan@cloudflare.com>
>> wrote:
>>
>>> Take, for example, a broadcast channel that's ordered but lossy. A Commit
>>> gets sent where the new epoch id collides with the previous epoch id, but
>>> not all members get the Commit. So now the group is fractured and there's
>>> no way for members in the previous epoch to know that they missed a
>>> message. To detect lost messages, the application developer has to
>>> immediately re-implement the counter epoch id that you've just
>>> removed.
>>>
>>> That's an example of a system where this change is pointless / harmful.
>>> If you could provide an example of a system where the change is *helpful*,
>>> that would be more interesting.
>>>
>>> On Fri, Jul 24, 2020 at 11:05 AM Richard Barnes <rlb@ipv.sx> wrote:
>>>
>>>> I'm honestly really confused by the worry about collisions here.  The
>>>> current epoch ID collides trivially -- it's just a counter!  And the group
>>>> ID is assumed to be public anyway.  So if your authentication scheme is
>>>> broken when you have a (group ID, epoch ID) collision, then your scheme is
>>>> already broken.
>>>>
>>>> If we're going to block this change on this basis, then we need to see
>>>> clearly articulated an authentication scheme that is secure in the current
>>>> model, but broken in the case where epoch IDs are derived off of the key
>>>> schedule.
>>>>
>>>> --Richard
>>>>
>>>> On Tue, Jul 14, 2020 at 10:38 AM Hale, Britta (CIV) <
>>>> britta.hale@nps.edu> wrote:
>>>>
>>>>> Ensuring the most conservative design by default seems advisable;
>>>>> however it is not clear that this proposal for hiding content type is
>>>>> actually doing that. Indeed, if an application is concerned about hiding
>>>>> the content type, would not that same type of application also be concerned
>>>>> about collisions?
>>>>>
>>>>>
>>>>>
>>>>> As Brendan notes, we are not talking about accidental collisions here.
>>>>> Pre-computation is entirely possible, such as by a malicious group member,
>>>>> so the window of attack is not limited to one epoch.
>>>>>
>>>>>
>>>>>
>>>>> If this particular PR is of interest to some, then
>>>>> it would be good to see a clarifying explanation as to the security it
>>>>> achieves (i.e. why this is “security by design” despite introducing such
>>>>> collision possibilities), or a concrete use-case. Alternatively, are we
>>>>> really limited to a 64-bit field, or can that length be adjusted to
>>>>> mitigate the introduced problems?
>>>>>
>>>>> A clarifying point for those following this thread: some references in
>>>>> the email chain state that this is encryption of the epoch ID. That is not
>>>>> the case. The proposal being discussed is about a hash thereof (hence
>>>>> discussion on collisions).
>>>>>
>>>>>
>>>>>
>>>>> Britta
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> *From: *MLS <mls-bounces@ietf.org> on behalf of Brendan McMillion
>>>>> <brendan=40cloudflare.com@dmarc.ietf.org>
>>>>> *Date: *Monday, July 13, 2020 at 9:37 AM
>>>>> *To: *Richard Barnes <rlb@ipv.sx>
>>>>> *Cc: *Messaging Layer Security WG <mls@ietf.org>
>>>>> *Subject: *Re: [MLS] Hiding content type
>>>>>
>>>>>
>>>>>
>>>>> As far as collision resistance, I'm not too worried about collisions
>>>>> among 64-bit random values, especially as the scope for collision is fairly
>>>>> small, arguably just one epoch.
>>>>>
>>>>>
>>>>>
>>>>> The issue isn't with accidental collisions, it's with malicious ones.
>>>>> An attacker can purposefully (and quickly) generate commits that create
>>>>> duplicate epoch ids, and send them to a client to corrupt their group state.
>>>>>
>>>>>
>>>>>
>>>>> I think the idea here is to be conservative by default.  There are
>>>>> systems in which you can hide the metadata you're talking about with things
>>>>> like mixnets and dropboxes.
>>>>>
>>>>>
>>>>>
>>>>> It's not just being more conservative, you're making changes
>>>>> specifically for very niche use-cases where I don't think the security of
>>>>> the whole system has been thought through. Maybe you have an example in
>>>>> mind, but I can't think of a system where this change provides any
>>>>> additional privacy.
>>>>>
>>>>>
>>>>>
>>>>> On Mon, Jul 13, 2020 at 10:56 AM Richard Barnes <rlb@ipv.sx> wrote:
>>>>>
>>>>> On Mon, Jul 13, 2020 at 10:57 AM Brendan McMillion <
>>>>> brendan@cloudflare.com> wrote:
>>>>>
>>>>> With respect to using an opaque epoch id, I believe this was proposed
>>>>> in #245 <https://github.com/mlswg/mls-protocol/pull/245> and
>>>>> consensus was against it because it makes it more difficult to implement
>>>>> the DS. It's also not clear to me what security properties you want from an
>>>>> opaque epoch id, because the current PR doesn't provide collision
>>>>> resistance and I'd expect this to be a source of confusion and bugs.
>>>>>
>>>>>
>>>>>
>>>>> I think the discussion on #245 got derailed a bit by the focus on
>>>>> non-linear epochs.  The point here is that even in a case where you have
>>>>> linear history, I think there's intrinsic value in the epoch ID being
>>>>> opaque because it gives intermediaries less opportunity for ossification.
>>>>>
>>>>>
>>>>>
>>>>> As far as collision resistance, I'm not too worried about collisions
>>>>> among 64-bit random values, especially as the scope for collision is fairly
>>>>> small, arguably just one epoch.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> With respect to encrypting the content type, it also seems to me that
>>>>> this would cause issues because different content types have different
>>>>> delivery guarantees. Specifically: messages are allowed to be unordered and
>>>>> lossy, proposals are allowed to be unordered but not lossy, and commits are
>>>>> both ordered and not lossy. In the deployment scenarios we've talked about,
>>>>> the DS essentially always needs to know the content type to provide this.
>>>>> Conversely, you don't get much additional privacy from encrypting the
>>>>> content type because an eavesdropper can see the epoch id change and infer
>>>>> which message was a commit, or see a dropped message getting re-sent and
>>>>> infer it was a proposal.
>>>>>
>>>>>
>>>>>
>>>>> I don't disagree that there are different guarantees you need, but
>>>>> like I said, I think the idea here is to be conservative by default.  There
>>>>> are systems in which you can hide the metadata you're talking about with
>>>>> things like mixnets and dropboxes.  Keeping the ciphertext as opaque as
>>>>> possible keeps the door open to running MLS over those systems without
>>>>> MLS's own metadata undermining their privacy properties.
>>>>>
>>>>>
>>>>>
>>>>> FWIW, in the systems I'm looking at building, this PR does not make
>>>>> any difference, because clients use separate channels for Proposals/Commits
>>>>> vs. application data and the server doesn't check.
>>>>>
>>>>>
>>>>>
>>>>> --Richard
>>>>>
>>>>> On Mon, Jul 13, 2020 at 8:56 AM Richard Barnes <rlb@ipv.sx> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>>
>>>>>
>>>>> Recall that PR#349 does two things (1) use an opaque epoch ID, and (2)
>>>>> encrypt the content type of the message (application / proposal / commit)
>>>>> [1].
>>>>>
>>>>>
>>>>>
>>>>> There was some discussion on the call last week that encrypting the
>>>>> content type might not be that useful, since an application that relies on
>>>>> a server to assure ordering of commits will need to at least be able to
>>>>> tell commits from other things.  Of course, even if the content type is
>>>>> encrypted by default, the application can add it back in authenticated
>>>>> data, or in some unauthenticated wrapper.
>>>>>
>>>>>
>>>>>
>>>>> Net of those considerations, I'm personally still inclined to merge
>>>>> the PR, to have a conservative baseline.  But those on the call thought it
>>>>> would be useful to pose this to the group, so please speak up if you have
>>>>> concerns.
>>>>>
>>>>>
>>>>>
>>>>> --Richard
>>>>>
>>>>>
>>>>>
>>>>> [1] https://github.com/mlswg/mls-protocol/pull/349
>>>>>