Re: [MMUSIC] Review: draft-ietf-mmusic-sdp-uks

Martin Thomson, Tue, 30 October 2018 07:01 UTC

From: Martin Thomson <>
Date: Tue, 30 Oct 2018 18:01:08 +1100
To: Adam Roach <>
Subject: Re: [MMUSIC] Review: draft-ietf-mmusic-sdp-uks
List-Id: Multiparty Multimedia Session Control Working Group <>

On Tue, Aug 28, 2018 at 9:16 AM Adam Roach wrote:
> I was asked by the MMUSIC chairs to review draft-ietf-mmusic-sdp-uks.
> At a high level, I don't think the proposed fix works for SIP in current
> deployments; but I also doubt that the described attack can be credibly
> mounted in SIP. Specific details below.

On the first half, maybe there's a misunderstanding about what
properties are reasonable to assume.  You assume a far worse baseline
for what is deployed, which I agree is the case, but that doesn't
invalidate this technique completely.  More below.

On the second half, I wholeheartedly agree.  I framed this as the "most
boring attack ever" when I originally presented it.

> §2.2:
>  >  The use of TLS with SDP depends on the integrity of session
>  >  signaling.  Assuming signaling integrity limits the capabilities of
>  >  an attacker in several ways.
> This is generally a bad assumption. The penetration of TLS in SIP
> deployments is (by my understanding) incredibly low.

Just to be clear, this isn't addressing the TLS used on the signaling
path.  It's the use of TLS with SDP, which means media path.  I hope
that's clear enough and you are just making a point about the complete
lack of integrity for SIP signaling (which I agree is the case).

I have a similar understanding regarding TLS on the signaling side.
However, the security of a=fingerprint depends critically on integrity
at some level, even if we don't concretely have it.  Maybe I'm
misunderstanding your point.  I might infer an argument akin to "SIP is
a complete mess, so no point fixing SDP", but I

In any case, I would rather state it as a precondition/assumption and
move on.  RFC 8122 already makes the integrity point fairly pointedly:

   It is the responsibility of the encapsulating protocol to ensure the
   integrity of the SDP security descriptions.   -- RFC 8122

Yes, the rest of that section goes on to mention key continuity
mechanisms and other such hand-waving, but in the end a=fingerprint
doesn't really work without integrity.
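To make the dependency concrete, here is a minimal sketch of the check an
endpoint performs against a=fingerprint.  It assumes sha-256 only, and the
certificate bytes are placeholders rather than real DER.  The helper names
(fingerprint, fingerprint_matches) are made up for illustration.  The point
is only that the check compares against whatever the signaling delivered,
so a rewritten line passes just as happily as an honest one.

```python
import hashlib
import hmac


def fingerprint(cert_der: bytes) -> str:
    """Format a SHA-256 hash as colon-separated uppercase hex pairs."""
    digest = hashlib.sha256(cert_der).hexdigest().upper()
    return ":".join(digest[i:i + 2] for i in range(0, len(digest), 2))


def fingerprint_matches(sdp_line: str, cert_der: bytes) -> bool:
    """Check a certificate against an 'a=fingerprint:sha-256 AB:CD:...' line."""
    prefix = "a=fingerprint:"
    if not sdp_line.startswith(prefix):
        return False
    hash_func, _, value = sdp_line[len(prefix):].partition(" ")
    if hash_func.lower() != "sha-256":
        return False  # this sketch only handles sha-256
    return hmac.compare_digest(value.strip().upper(), fingerprint(cert_der))


# Placeholder bytes stand in for DER-encoded certificates (assumption).
patsy_cert = b"placeholder certificate for Patsy"
mallory_cert = b"placeholder certificate for Mallory"

# Honest signaling: the line describes Patsy's certificate.
honest_sdp = "a=fingerprint:sha-256 " + fingerprint(patsy_cert)
print(fingerprint_matches(honest_sdp, patsy_cert))

# Signaling without integrity: the line was rewritten in transit, and the
# same check now accepts Mallory's certificate and rejects Patsy's.
rewritten_sdp = "a=fingerprint:sha-256 " + fingerprint(mallory_cert)
print(fingerprint_matches(rewritten_sdp, mallory_cert))
print(fingerprint_matches(rewritten_sdp, patsy_cert))
```

Nothing in the check itself can tell the honest line from the rewritten one;
that distinction has to come from the encapsulating protocol.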

WebRTC identity and RFC 8224 (and friends) take steps to protect at
least the relationship between a=fingerprint and identities, which
goes a long way.  If those are used, it's a different story.  To the
extent that people deploy these things (we still hold hope for
PASSPoRT) we're OK with respect to our stated goals, even if the rest
of the signaling traverses hostile territory with protection less
effective than one of those little drink umbrellas.

> The unnumbered figure in this section (consider adding figure numbers)
> doesn't make sense to me.
>       Norma               Mallory             Patsy
>       (fp=N)               -----              (fp=P)
>         |                    |                  |
>       1 +---Offer1 (fp=N)--->|                  |
>       2 +-----Offer2 (fp=N)-------------------->|
>       3 |<--------------------Answer2 (fp=P)----+
>       4 |<--Answer1 (fp=P)---+                  |
>         |                    |                  |
>       5 |======DTLS1====>(Forward)====DTLS1====>|
>       6 |<=====DTLS2=====(Forward)<===DTLS2=====|
>       7 |======Media1===>(Forward)====Media1===>|
>       8 |<=====Media2====(Forward)<===Media2====|
>         |                    |                  |
>       9 |======DTLS2===========>(Drop)          |
>         |                    |                  |
> Presumably, DTLS1 corresponds to Offer1, while DTLS2 corresponds to Offer2?
> Please make that explicit.  Also, it took me a few beats to figure out that
> "fp=N" meant "Fingerprint = Norma," so it might be worth explaining that
> notation.
> Where I get lost, however, is around step 6 (refer to the step numbers
> above), which is described as:
>     Mallory also intercepts
>     packets from Patsy and forwards those to Norma at the transport
>     address that Norma associates with Mallory.
> This seems to introduce a much stronger requirement on the attacker; namely
> that they have the ability to intercept traffic bound for one of the victims.
> This should be discussed in section 2.2. I'll note that, for step 9 to work,
> the attacker must have the ability to cause packets to be dropped -- so merely
> being able to observe and inject traffic isn't sufficient. Mallory has to
> literally be on-path and effectively acting as network infrastructure.

Correct.  I had assumed that as part of the standard threat model:

   Applications protocol designers MUST NOT assume that all attackers
   will be off-path.   -- RFC 3552

But I will make that clear.  I don't believe that this is a general
constraint on attack here, just for this particular, implausible

> The other thing that confuses me is that this scenario doesn't describe a
> credible mechanism by which Mallory might compel Norma to engage in some
> really strange call behavior; namely, the initiation of two outbound calls
> simultaneously.

I think that with the framing you suggest (strengthen generically as
opposed to block a specific exigent threat), that doesn't matter very
much.  As in, the example only exists to illustrate the possibilities
available to an attacker, not to convince someone that their
deployment is broken and needs to be fixed post haste.

> In general, I don't think this works because of the issues involving SIP
> signaling integrity that I discuss above.

The analysis we did suggests that this is OK.  SIGMA shows us that
authenticated identity protocols depend less on the strength of the
binding between identity and key than on the identity being included
in the session and validated by peers.  That
is, you don't defend against unknown key share attacks by bolstering
the CA processes and having them check that keys are controlled by
applicants.  Instead, the protocol needs to facilitate the check.
It's a little counter-intuitive, but the explanation in the paper is
really good if you want to understand this better.

> I think you need to somehow bind the PASSPoRT identity into the DTLS
> signaling, rather than using yet another identifier that isn't bound to
> identity. I haven't worked through all the details here, but it probably
> looks a lot like what you've proposed for WebRTC.

This is right.  I didn't integrate PASSPoRT defenses originally
because that was in flux.  But the same technique we use for WebRTC
identity works for PASSPoRT.  The two are very similar in form, which
isn't accidental, so the same technique works perfectly.  I've updated
the draft for that.
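As a rough illustration of the shape of that technique, the sketch below
has each peer commit in the handshake to the identity assertion it sent in
signaling, and has the other side compare.  It assumes the commitment is a
SHA-256 hash over the assertion bytes; that choice, the helper names, and
the assertion contents are all assumptions for illustration and are not
taken from the draft text.

```python
import hashlib
import hmac


def identity_binding(assertion: bytes) -> bytes:
    """Commitment carried in the handshake (assumed: SHA-256 of the assertion)."""
    return hashlib.sha256(assertion).digest()


def peer_binding_ok(signaled_assertion: bytes, handshake_binding: bytes) -> bool:
    """Compare the binding from the handshake with the assertion from signaling."""
    expected = identity_binding(signaled_assertion)
    return hmac.compare_digest(expected, handshake_binding)


# Made-up assertion bytes; real ones would be a WebRTC identity assertion
# or a full-form PASSPoRT.
patsy_assertion = b"hypothetical assertion naming patsy"
mallory_assertion = b"hypothetical assertion naming mallory"

# Honest session: signaling and handshake both come from Patsy.
print(peer_binding_ok(patsy_assertion, identity_binding(patsy_assertion)))

# Unknown key-share: signaling told Norma that the peer is Mallory, but the
# forwarded handshake is Patsy's and commits to Patsy's own assertion.
print(peer_binding_ok(mallory_assertion, identity_binding(patsy_assertion)))
```

The second check fails even though the fingerprint in the signaling would
have matched Patsy's certificate, which is the property being sought here.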

The worst wrinkle there is that the short form of a PASSPoRT
doesn't work.  That would be vulnerable to a duplicate signature key
selection attack, the likes of which hit ACME a few years ago (see
We'll need the full form.

> §3.1:
>  >  Endpoints MUST check that the "id" parameter in the extension that
>  >  they receive includes the "tls-id" attribute value that they received
>  >  in their peer's session description.  Comparison can be performed
>  >  with either the decoded ASCII string or the encoded octets.
> What is the distinction being made here between "decoded" and "encoded"
> forms?

I've tried to clarify.  In this case, most internal representations of
"string" and "sequence of octets" will have the same pattern of bits
in memory, so the risk of an actual problem is low, but there are
domain transformations in play here.
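A small sketch of the two comparison domains, assuming the "tls-id" value
arrives from SDP as a string and the extension "id" arrives as raw octets.
The function names and the example value are made up for illustration.

```python
import hmac


def id_matches_octets(tls_id: str, ext_id: bytes) -> bool:
    """Compare in the octet domain: encode the SDP value, then compare bytes."""
    try:
        expected = tls_id.encode("ascii")
    except UnicodeEncodeError:
        return False  # a non-ASCII tls-id cannot match an ASCII encoding
    return hmac.compare_digest(expected, ext_id)


def id_matches_string(tls_id: str, ext_id: bytes) -> bool:
    """Compare in the string domain: decode the extension octets first."""
    try:
        received = ext_id.decode("ascii")
    except UnicodeDecodeError:
        return False  # octets outside ASCII never match
    return received == tls_id


# Made-up identifier, used only to exercise both paths.
tls_id = "abc3de65cddef001be82"

print(id_matches_octets(tls_id, b"abc3de65cddef001be82"))
print(id_matches_string(tls_id, b"abc3de65cddef001be82"))
print(id_matches_octets(tls_id, b"some-other-id"))
print(id_matches_string(tls_id, b"\xff\xfe"))
```

Both functions give the same answer for ASCII input; the only difference is
where the encode or decode step happens and how malformed input is rejected.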

> I'm a little surprised not to see a discussion of cryptoagility in here.

Hadn't you heard?  It's a post-cryptographic-agility world now.  The
view here is that it is easier to define a whole new TLS extension
than it is to deal with the possibility that the codepoint you used to
signal which function you used has ossified.  I'll add a note.