[MMUSIC] Benjamin Kaduk's Discuss on draft-ietf-mmusic-sdp-uks-06: (with DISCUSS and COMMENT)

Benjamin Kaduk via Datatracker <noreply@ietf.org> Mon, 05 August 2019 16:27 UTC

From: Benjamin Kaduk via Datatracker <noreply@ietf.org>
To: "The IESG" <iesg@ietf.org>
Cc: draft-ietf-mmusic-sdp-uks@ietf.org, Bo Burman <bo.burman@ericsson.com>, mmusic-chairs@ietf.org, bo.burman@ericsson.com, mmusic@ietf.org
Reply-To: Benjamin Kaduk <kaduk@mit.edu>
Message-ID: <156502247647.24440.17878436939662954486.idtracker@ietfa.amsl.com>
Date: Mon, 05 Aug 2019 09:27:56 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/mmusic/XeYUeXexGWIF50_ptgXaeyNhagc>
Subject: [MMUSIC] Benjamin Kaduk's Discuss on draft-ietf-mmusic-sdp-uks-06: (with DISCUSS and COMMENT)

Benjamin Kaduk has entered the following ballot position for
draft-ietf-mmusic-sdp-uks-06: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)

Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about IESG DISCUSS and COMMENT positions.

The document, along with other ballot positions, can be found here:


These are both pretty minor points, in the grand scheme of things, but I
do think it would be hazardous to publish the document without
addressing them.

The semantics surrounding the "external_id_hash" TLS extension seem
insufficiently specified to admit interoperable implementation.  In
Section 3.2 we read that it "carries a hash of the identity assertion that
communicating peers have exchanged", as if there was a single
distinguished identity assertion for the session.  But, if we read on,
we learn that there is not one identity assertion, but (in the general
case) two, one for each party, and that what seems to actually be
intended is that each party sends the hash of the identity assertion
corresponding to the sender's identity, with the requirements to send an
empty external_id_hash if the party in question is not providing
identity bindings.  Additionally, the text about having an empty
"external_id_hash" extension in ClientHello or
ServerHello/EncryptedExtensions is written in a way that implies that
all parties generate a ClientHello and all parties generate a
ServerHello or EncryptedExtensions message, whereas these are actually
conditional on whether the party is acting as (D)TLS client or server.

Similarly, the current text for the last sentence of Section 3.2 ("In
TLS 1.3, the "external_id_hash" extension MUST be sent in the
EncryptedExtensions message.") can be (mis)read as implying that all
EncryptedExtensions messages sent by TLS servers that implement this
specification must include this extension, which would violate the TLS
extension-negotiation model since it mandates the server sending an
extension without regard to the client having indicated support for the
extension.  Perhaps "MUST NOT be sent in the TLS 1.3 ServerHello message"
conveys the restriction more clearly?
(A similar comment applies to the corresponding statement in Section
4.3, which interestingly enough already has a "In TLS 1.3, the
"external_session_id" extension MUST NOT be included in a ServerHello."
disclaimer in addition to the problematic sentence.)
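
To make the negotiation concern concrete, here is a minimal Python
sketch of a server that only answers the extension when the client
offered it, and that places it in EncryptedExtensions rather than
ServerHello for TLS 1.3.  The function name, the parameters, and the
dictionary layout are illustrative assumptions; they are not taken from
the draft or from any TLS library.

```python
from typing import Dict, Optional

EXT = "external_id_hash"


def server_response_extensions(
    tls13: bool,
    client_offered: bool,
    own_binding_hash: Optional[bytes],
) -> Dict[str, Dict[str, bytes]]:
    """Return the extensions a server would place in each handshake message.

    Hypothetical helper: own_binding_hash is the hash of the server's own
    identity assertion, or None when the server provides no binding.
    """
    messages: Dict[str, Dict[str, bytes]] = {
        "ServerHello": {},
        "EncryptedExtensions": {},  # stays empty for TLS 1.2 and earlier
    }
    if not client_offered:
        # Extension-negotiation model: never answer an extension the
        # client did not offer, whatever the TLS version.
        return messages
    # An empty body signals support when there is no identity binding.
    body = own_binding_hash if own_binding_hash is not None else b""
    target = "EncryptedExtensions" if tls13 else "ServerHello"
    messages[target][EXT] = body
    return messages
```

Under this reading, the TLS 1.3 rule constrains only where the
extension goes, not whether it is sent.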


Section 2

   The attacker obtains an identity assertion for an identity it
   controls, but binds that to the fingerprint of one peer.  The
   attacker is then able to cause a TLS connection to be established
   where two endpoints communicate.  The victim that has its fingerprint
   copied by the attack correctly believes that it is communicating with
   the other victim; however, the other victim incorrectly believes that
   it is communicating with the attacker.

nit: maybe this could be reworded for improved clarity.  Perhaps, "two
endpoints other than the attacker communicate" or "two victim endpoints".

   A similar attack can be mounted without any communications
   established based on the SDP "fingerprint" attribute [FINGERPRINT].

At this point in the document, I don't know how to interpret "without
any communications established based on".

Section 2.1

   1.  An attacker can only modify the parts of the session signaling
       for a session that they are part of, which is limited to their
       own offers and answers.

nit(?): the first part of the sentence suggests that the attacker can
modify their peers' offers/answers, and it's not entirely clear (from a
rhetorical sense) how the latter clause is supposed to relate to the
former.

   The combination of these two constraints make the spectrum of
   possible attacks quite limited.  An attacker is only able to switch
   its own certificate fingerprint for a valid certificate that is
   acceptable to its peer.  Attacks therefore rely on joining two
   separate sessions into a single session.

nit: It's not clear to me (at this point in the document) whether this
is "victim A connects to attacker and also to victim B, and attacker
merges the first session into the second", or "victim A connects to
attacker and attacker connects to victim B, and attacker splices the two
together and steps out of the way".  (I assume the latter, but the text
hasn't clarified it yet.)

Section 2.3

   Third-party call control (3PCC) [RFC3725] is a technique where a
   signaling peer establishes a call that is terminated by a different
   entity.  This attack is very similar to the 3PCC technique, except
   where the TLS peers are aware of the use of 3PCC.

nit: Rhetorically-wise, I don't know what "except" is intended to mean
here.  Is the attack like 3PCC but in normal 3PCC the peers are unaware
of 3PCC use and in the attack they are?  The other way around?  ("except
that in the 3PCC case the TLS peers are aware of its use" would
disambiguate fine, I think.)

   It is understood that this technique will prevent the use of 3PCC if
   peers have different views of the involved identities, or the value
   of SDP "tls-id" attributes.

nit: understood by whom?  (I don't think that we need "It is understood
that" at all.)

Section 3

   The identity assertions used for WebRTC (Section 7 of [WEBRTC-SEC])
   and the SIP PASSPoRT using in SIP identity ([SIP-ID], [PASSPoRT]) are
   bound to the certificate fingerprint of an endpoint.  An attacker

nit: s/using/used/

   causes an identity binding to be created that binds an identity they
   control to the fingerprint of a first victim.

nit: I think we want "An attacker can cause" or "In an unknown-key-share
attack, an attacker causes".

   really talking to the first victim.  The attacker only needs to
   create an identity assertion that covers a certificate fingerprint of
   the first victim.

Well, and actually cause the traffic to shuffle around so the victims
are sending/receiving from each other.

   The problem might appear to be caused by the fact that the authority
   that certifies the identity binding is not required to verify that
   the entity requesting the binding controls the keys associated with
   the fingerprints.  Neither SIP nor WebRTC identity providers are not
   required to perform this validation.  However, validation of keys by
   the identity provided is not relevant because verifying control of
   the associated keys is not a necessary condition for a secure
   protocol, nor would it be sufficient to prevent attack [SIGMA].

nit: in the last sentence, I'm not sure that "validation of keys by the
identity provided" is correct; "identity provider" would make more sense.

   This form of unknown key-share attack is possible without
   compromising signaling integrity, unless the defenses described in

nit: I'd suggest s/possible/even possible/

   Section 4 are used.  Endpoints MUST use the "external_session_id"
   extension (see Section 4.3) in addition to the "external_id_hash"
   (Section 3.2) so that two calls between the same parties can't be
   altered by an attacker.

nit(?): These normative requirements kind of come out of nowhere, in
terms of the flow of language.  Maybe "In order to prevent this attack,
endpoints MUST", or just move the normative requirements closer to the
mechanisms themselves?

Section 3.2

   A WebRTC identity assertion is provided as a JSON [JSON] object that
   is encoded into a JSON text.  The resulting string is then encoded
   using UTF-8 [UTF8].  The content of the "external_id_hash" extension

I don't really understand the separate UTF-8 step -- RFC 8259 already
requires text to be UTF-8 encoded.
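
As a rough illustration of why the separate step looks redundant, the
Python sketch below derives an extension body directly from the JSON
text.  SHA-256 is an assumption here (the excerpt quoted above does not
name the hash), and the function name and the sample assertion are
hypothetical.

```python
import hashlib


def external_id_hash_body(json_text: str) -> bytes:
    """Hash the identity assertion exactly as it was exchanged.

    json_text is the serialized JSON text of the assertion.  Encoding it
    as UTF-8 is the only serialization step needed before hashing; the
    hash function (SHA-256) is an assumption made for this sketch.
    """
    octets = json_text.encode("utf-8")
    return hashlib.sha256(octets).digest()


# Hypothetical assertion text, for illustration only.
sample = '{"idp":{"domain":"example.org"},"assertion":"opaque-value"}'
body = external_id_hash_body(sample)
```

Both peers need to hash the same octets, so the input should be the
text as exchanged rather than a re-serialized copy of the parsed object.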

I think this section would be easier to read if the different cases of
identity encoding/transmission were broken out into a bulleted or
enumerated list (the latter might make it easier to extend in the
future): right now I think we have (1) pure WebRTC, (2) SDP "identity",
and (3) SIP PASSPoRT, but I'm not 100% sure I'm reading the text
correctly.

If that's done, it would also be a good opportunity to clarify that the
note about hash agility applies to the TLS extension as a whole, not
just the PASSPoRT case.

   Where a PASSPoRT is used, the compact form of the PASSPoRT MUST be
   expanded into the full form.  The base64 encoding used in the SIP

nit: this is written to assume that only compact PASSPoRTs will ever be
used, which IIUC is not the case.

                                                    This allows its peer
   to include a hash of its identity binding.  An endpoint without an
   identity binding MUST include an empty "external_id_hash" extension
   in its ServerHello or EncryptedExtensions message, to indicate
   support for the extension.

nit: and that it has validated the client's identity binding?

   A peer that receives an "external_id_hash" extension that does not
   match the value of the identity binding from its peer MUST
   immediately fail the TLS handshake with an error.  This includes
   cases where the binding is absent, in which case the extension MUST
   be present and empty.

nit: I'd suggest rewording the second sentence as follows (since the
conditional logic on "extension present but binding absent" could be

% The absence of an identity binding does not relax this requirement --
% an extension received when the peer has not provided an identity
% binding on the signalling layer must still be validated to have the
% zero-length extension body.
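
The suggested rule can be sketched in Python as follows.  This is only
an illustration of the intended logic; the exception type, the function
name, and the convention of passing None for an absent extension or an
absent binding are assumptions made for the example.

```python
import hmac
from typing import Optional


class HandshakeFailure(Exception):
    """Stand-in for aborting the TLS handshake with an error."""


def check_external_id_hash(
    received: Optional[bytes],
    expected_hash: Optional[bytes],
) -> bool:
    """Validate a received "external_id_hash" extension body.

    received: the extension body, or None if the extension was absent.
    expected_hash: hash of the peer's identity binding taken from
        signaling, or None if the peer provided no binding.
    Returns True when the extension was present and validated, False
    when it was absent (that case is left to the caller's policy).
    """
    if received is None:
        return False
    # With no binding in signaling, the only acceptable body is empty.
    expected = expected_hash if expected_hash is not None else b""
    if not hmac.compare_digest(received, expected):
        raise HandshakeFailure("external_id_hash does not match signaling")
    return True
```

A non-empty body received when no binding was signaled fails the
handshake, just as a mismatched hash does.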

   A peer that receives an identity binding, but does not receive an
   "external_id_hash" extension MAY choose to fail the connection,
   though it is expected that implementations written prior to the
   definition of the extensions in this document will not support both
   for some time.

nit: I don't think the comma after "binding" is needed.
Also, is the "not" intended?  I'm not entirely sure what "both" is
intended to refer to.

Section 4

nit(?): There's an annoying lack of parallelism in the Section titles
for Sections 3 and 4, though I don't have a good suggestion for Section
4's title -- "Attack on Raw Fingerprints" is the best I can do right now.

   Even if the integrity session signaling can be relied upon, an

nit: s/integrity session signaling/session signaling integrity/?

Section 4.1

   another honest endpoint.  The attacker convinces the endpoint that
   their session has completed, and that the session with the other
   endpoint has succeeded.

Even with the benefit of the figure, I'm not sure I am properly
understanding the distinction between "completed" and "succeeded".  Is
the idea that the "completed" session finishes a DTLS handshake and then
immediately hangs up?  Or is this entirely at the signalling layer?

                 For this reason, it might be necessary to permit the
   signaling from Patsy to reach Norma to allow Patsy to receive a call
   setup completion signal, such as a SIP ACK.  Once the second session
   is established, Mallory might cause DTLS packets sent by Norma to
   Patsy to be dropped.  It is likely that these DTLS packets will be
   discarded by Patsy as Patsy will already have a successful DTLS
   connection established.

nit: Is this "it is likely these packets would be discarded even if
Mallory lets them through"?

   This attack creates an asymmetry in the beliefs about the identity of
   peers.  However, this attack is only possible if the victim (Norma)
   is willing to conduct two sessions nearly simultaneously, if the
   attacker (Mallory) is on the network path between the victims, and if
   the same certificate - and therefore SDP "fingerprint" attribute
   value - is used in both sessions.

This is the same certificate used by Norma in both sessions, right?

Section 4.3

   Where RTP and RTCP [RTP] are not multiplexed, it is possible that the
   two separate DTLS connections carrying RTP and RTCP can be switched.
   This is considered benign since these protocols are designed to be
   distinguishable.  RTP/RTCP multiplexing is advised to address this

What does "switched" mean?  That Mallory could swap the data contents
around as an active MITM?

   This defense is not effective if an attacker can rewrite "tls-id"
   values in signaling.  Only the mechanism in "external_id_hash" is
   able to defend against an attacker that can compromise session

Please help me check my understanding: in terms of just the operation of
the TLS extensions, "external_id_hash" and "external_session_id" provide
similar protection, in that they are just validating that what's in the
TLS handshake matches what's in the signalling layer.  The added
protection from "external_id_hash" only comes when the endpoints
actually go and contact the peers' IdP to validate the identity
assertions that are transmitted in the signalling layer.
If my understanding is correct, we should probably add a bit more text
here indicating the need for more validation than just the validation of
the TLS extension contents that this document describes.
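
A small Python sketch of that two-step reading, under the assumption
that the understanding above is correct: the first check only ties the
handshake to signaling, and the second is the separate validation of
the assertion with the IdP.  The function name, the verify_with_idp
callback, and the use of SHA-256 are hypothetical and exist only for
this example.

```python
import hashlib
import hmac
from typing import Callable


def peer_identity_checked(
    ext_body: bytes,
    assertion_text: str,
    verify_with_idp: Callable[[str], bool],
) -> bool:
    """Return True only if both layers of validation succeed.

    Step 1: the "external_id_hash" body matches the assertion seen in
    signaling (SHA-256 assumed for this sketch).
    Step 2: the assertion itself is validated with the peer's IdP, via
    a caller-supplied callback that stands in for that exchange.
    """
    expected = hashlib.sha256(assertion_text.encode("utf-8")).digest()
    matches_signaling = hmac.compare_digest(ext_body, expected)
    return matches_signaling and verify_with_idp(assertion_text)
```

Step 1 alone gives no more than the "tls-id" comparison does if the
attacker can rewrite signaling; only step 2 adds the extra protection.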

Section 5

   In the absence of any higher-level concept of peer identity, the use
   of session identifiers does not prevent session concatenation if the
   attacker is able to copy the session identifier from one signaling
   session to another.  This kind of attack is prevented by systems that
   enable peer authentication such as WebRTC identity [WEBRTC-SEC] or
   SIP identity [SIP-ID].  However, session concatenation remains
   possible at higher layers: an attacker can establish two independent
   sessions and simply forward any data it receives from one into the

And in such a case the attacker has access to the media plaintext, too,
right?

   Use of the "external_session_id" does not guarantee that the identity
   of the peer at the TLS layer is the same as the identity of the
   signaling peer.  The advantage an attacker gains by concatenating
   sessions is limited unless it is assumed that signaling and TLS peers
   are the same.  If a secondary protocol uses the signaling channel
   with the assumption that the signaling and TLS peers are the same
   then that protocol is vulnerable to attack unless they also validate
   the identity of peers at both layers.

Is this paragraph describing a case like in
draft-ietf-rtcweb-security-arch, where we do send (and verify) identity
assertions at the signalling layer?  That is, the verification at the
IdP counts for validation at the "secondary protocol" layer and the
verification that the TLS extension matches the signalling constitutes
verification at the TLS layer, thereby achieving the validation "at both
layers"?

   It is important to note that multiple connections can be created
   within the same signaling session.  An attacker might concatenate
   only part of a session, choosing to terminate some connections (and
   optionally forward data) while arranging to have peers interact
   directly for other connections.  It is even possible to have
   different peers interact for each connection.  This means that the
   actual identity of the peer for one connection might differ from the
   peer on another connection.

How do or could the mitigations specified in this document address these

Section 8.2

I don't see how [BASE64] is only informative; we require base64-decoding
for some of the procedures.