Re: [MMUSIC] Handling of unverified data and media

Jonathan Lennox, Mon, 13 March 2017 22:17 UTC

From: Jonathan Lennox
To: Christer Holmberg
Date: Mon, 13 Mar 2017 22:17:14 +0000
Cc: Flemming Andreasen, Harald Alvestrand, mmusic
Subject: Re: [MMUSIC] Handling of unverified data and media

On Mar 13, 2017, at 5:44 PM, Christer Holmberg wrote:


> If the answerer is in such a “hurry” to start sending media, one would think it also makes sure the answer is sent as early as possible, so that media can flow in both directions.

You can send the answer as early as possible, but if two endpoints on the same subnet are talking via a signaling server on the other side of the planet, there’s only so much you can do.

> My question is: is this something that’s causing problems in real deployments, and requires a change in the standard? To me it seems like something that is very unlikely to occur, or at least like something that can easily be avoided by implementation means…

I’m pretty sure this is something that’s already allowed by the current standards, but this fact is substantially non-obvious. So I’m afraid that we’ll get interop failures if we don’t point it out.

Note that in most cases it can only happen if endpoints take advantage of the freedoms granted by RFC 5245bis’s “passive-aggressive” algorithm, where we clarified that an endpoint is allowed to send on any valid pair, not just the selected nominated pair.  (If you don’t send until you have a nominated pair, I believe the only way this race can occur is if an ICE-Lite endpoint calls a Full endpoint; otherwise, the offerer is the controlling endpoint.)

I’m not sure what “implementation means” you’re thinking of, though.  Avoiding sending media/DTLS until you receive an incoming connectivity check?



From: Jonathan Lennox
Sent: 13 March 2017 20:37
To: Christer Holmberg
Cc: Eric Rescorla; Bernard Aboba; Flemming Andreasen; Harald Alvestrand; mmusic
Subject: Re: [MMUSIC] Handling of unverified data and media

On Mar 11, 2017, at 9:52 AM, Christer Holmberg wrote:


> Is this a theoretical issue?
>
> At least if you use ICE, you are going to receive the answer before you receive any media, as you are going to do the connectivity checks etc.

No, even with ICE it’s possible for media or the DTLS handshake to outrace the answer.

This is because ICE offering endpoints respond to connectivity checks before they receive an answer.  When the answerer receives this successful connectivity check response, it puts the relevant pair in the Valid list, and then (if it has the active role, as recommended) can legitimately initiate DTLS on this pair.

If two ICE endpoints have a short RTT and clear connectivity between them, but a long RTT to their signaling server, this can happen quite easily.
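
A rough timing sketch of the race described above may help. It is only a model under stated assumptions, not a measurement: the answerer sends its answer and its first connectivity check at time 0, the answer crosses two signaling legs (answerer to server, server to offerer), and the first DTLS flight needs one media-path round trip for the check and its response plus half a round trip to reach the offerer. The function names and the example delays are hypothetical.

```python
def answer_arrival_ms(signaling_one_way_ms: float) -> float:
    """Time for the SDP answer to reach the offerer via the signaling server."""
    # Assumption: both signaling legs have the same one-way delay.
    return 2 * signaling_one_way_ms


def first_dtls_arrival_ms(media_rtt_ms: float) -> float:
    """Time for the answerer's first DTLS flight to reach the offerer."""
    # One RTT for the connectivity check and its response (the pair becomes
    # Valid), then half an RTT for the ClientHello to travel to the offerer.
    return 1.5 * media_rtt_ms


def dtls_outraces_answer(signaling_one_way_ms: float, media_rtt_ms: float) -> bool:
    """True if DTLS reaches the offerer before the answer does."""
    return first_dtls_arrival_ms(media_rtt_ms) < answer_arrival_ms(signaling_one_way_ms)


if __name__ == "__main__":
    # Same subnet (2 ms RTT), signaling server far away (150 ms one way).
    print(dtls_outraces_answer(signaling_one_way_ms=150, media_rtt_ms=2))
    # Nearby signaling server (5 ms one way), distant peer (100 ms RTT).
    print(dtls_outraces_answer(signaling_one_way_ms=5, media_rtt_ms=100))
```

Under these assumptions the first case has DTLS arriving well ahead of the answer, and the second does not.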

> Also, in reality some implementations will not accept content before the answer arrives – no matter if DTLS is used or not – so the best thing is to, once the answer has been sent, just wait for a while before sending any content.

Fortunately, DTLS has retransmissions, so this shouldn’t cause failure, just a brief setup delay.
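
To put a rough number on that setup delay, here is a small sketch of the DTLS retransmission schedule. It assumes the timer values recommended in RFC 6347 (initial timeout of 1 second, doubling on each retransmission, capped at 60 seconds); real stacks, WebRTC ones in particular, may use shorter timers. It also assumes the offerer simply drops any flight that arrives before it is ready, and it ignores path delay. The function names are hypothetical.

```python
from typing import List, Optional


def send_times(retransmissions: int, initial_s: float = 1.0, max_s: float = 60.0) -> List[float]:
    """Times (seconds) at which a DTLS flight is sent: the original plus retransmissions."""
    times = [0.0]
    now, timeout = 0.0, initial_s
    for _ in range(retransmissions):
        now += timeout
        times.append(now)
        timeout = min(timeout * 2, max_s)  # exponential backoff with a cap
    return times


def first_accepted_send(ready_at_s: float, retransmissions: int = 6) -> Optional[float]:
    """First send time at or after the moment the offerer starts accepting DTLS."""
    for t in send_times(retransmissions):
        if t >= ready_at_s:
            return t
    return None  # gave up before the offerer was ready


if __name__ == "__main__":
    print(send_times(3))             # [0.0, 1.0, 3.0, 7.0]
    print(first_accepted_send(0.2))  # 1.0
```

With these default timers, an offerer that becomes ready 200 ms after the first ClientHello was dropped would see the handshake start at the 1 second mark.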



From: mmusic On Behalf Of Eric Rescorla
Sent: 11 March 2017 02:57
To: Bernard Aboba
Cc: Flemming Andreasen; mmusic WG
Subject: Re: [MMUSIC] Handling of unverified data and media

Sorry, no, I was just talking about what might or might not be safe.... The doc text is
a different question.


On Fri, Mar 10, 2017 at 4:05 PM, Bernard Aboba wrote:
EKR said:

"I haven't spent too much time on it, but it seems like it ought to be safe to hold
anything you receive prior to getting the fingerprint. It might be better, as MT
suggests, to discard the datachannel data, but I'm not sure why it would be

[BA] So you are saying that the MUST NOT allows the browser to buffer data/media but not to pass it to the application (in the case of the data channel) or to play it out?

On Fri, Mar 10, 2017 at 4:01 PM, Eric Rescorla wrote:
I haven't spent too much time on it, but it seems like it ought to be safe to hold
anything you receive prior to getting the fingerprint. It might be better, as MT
suggests, to discard the datachannel data, but I'm not sure why it would be


On Fri, Mar 10, 2017 at 2:47 PM, Roman Shpount wrote:
My assumption always was that data is received, decoded and discarded until the fingerprint is received and verified. This way the DTLS handshake completes, key frames are decoded, but the user is not presented with any unverified media.


Roman Shpount
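
The two handling policies floated in this part of the thread (decode and discard until the fingerprint is verified, or hold what arrives and release it afterwards) can be sketched as a small gate between the receive path and the application or renderer. This is an illustrative sketch only: the class, its method names, and the bound on held payloads are assumptions, not anything taken from a browser implementation, and the thread does not settle which policy is acceptable.

```python
from collections import deque
from typing import Callable, Deque


class UnverifiedGate:
    """Withholds received payloads until the peer's fingerprint has been checked."""

    def __init__(self, deliver: Callable[[bytes], None],
                 hold: bool = False, max_held: int = 64) -> None:
        self.deliver = deliver          # hands a payload to the app or renderer
        self.hold = hold                # True: buffer; False: discard
        self.held: Deque[bytes] = deque(maxlen=max_held)  # oldest dropped when full
        self.verified = False
        self.failed = False

    def on_payload(self, payload: bytes) -> None:
        if self.failed:
            return                      # connection is being torn down
        if self.verified:
            self.deliver(payload)
        elif self.hold:
            self.held.append(payload)   # keep for release after verification
        # otherwise: silently discard the unverified payload

    def on_fingerprint_checked(self, matches: bool) -> None:
        if not matches:
            self.failed = True
            self.held.clear()           # never release data from an unverified peer
            return
        self.verified = True
        while self.held:
            self.deliver(self.held.popleft())


if __name__ == "__main__":
    played = []
    gate = UnverifiedGate(played.append, hold=True)
    gate.on_payload(b"frame-1")         # arrives before the answer
    gate.on_fingerprint_checked(True)   # answer arrives, fingerprint matches
    gate.on_payload(b"frame-2")
    print(played)                       # [b'frame-1', b'frame-2']
```

With `hold=False` the first frame would be dropped instead, which corresponds to the discard policy.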

On Thu, Mar 9, 2017 at 6:58 PM, Martin Thomson wrote:
I think that the data channel question is easy, anything other than a
"no" is not acceptable.  Data in that form enters the security
boundary for an origin and it doesn't make any sense to risk attack
there.  (It's also likely unnecessary: if half a round trip of
signaling is slower than 5 round trips on the media path, then
something is messed up.)

I'm in two minds about the media part. For media, you could also
reasonably make the same origin-purity argument.  I'm inclined to say
that.  But we CAN isolate media from the origin (and we definitely
should if we allow this).

So, the media that arrives had to comply with your offer.  The DTLS
handshake also has to complete, which tells the receiver whether the
media needs to be confidential or not (at which point you can disable
this feature).

It's also possible that a receiver can require that an ICE
connectivity check was made (though this is inbound only, and I'm
unclear on whether having received an inbound check would normally
prevent the receiver from accepting a packet).

All told, that's a lot of information about the negotiated session for
an attacker to have.  The odds of this being an attack would *seem* to
be low.

On the other hand, we don't assume confidentiality of signaling; the
security model assumes that all this information is effectively public
and the protection we have against attack is the certificate
fingerprint.  This would remove that protection, albeit for a short

I have an extra question: does anyone plan to implement this?  It's
non-trivial.  I think that I know what I'd need to do in Firefox and
it would be quite disruptive.  Before committing to do that work
(which I will leave to others closer to this to decide), I'd probably
want more information on the actual advantage that it provides.

On 10 March 2017 at 07:10, Bernard Aboba wrote:
> In the W3C WEBRTC WG, an issue has been submitted relating to playout of
> unverified media:
> It has been suggested that, if the browser is configured to do so,
> playout be allowed for a limited period (e.g. 5 seconds) prior to
> fingerprint verification:
> Section 6.2 of draft-ietf-mmusic-4572-update-13 contains the following text,
> carried over from RFC 4572:
>    Note that when the offer/answer model is being used, it is possible
>    for a media connection to outrace the answer back to the offerer.
>    Thus, if the offerer has offered a 'setup:passive' or 'setup:actpass'
>    role, it MUST (as specified in RFC 4145 [7]) begin listening for an
>    incoming connection as soon as it sends its offer.  However, it MUST
>    NOT assume that the data transmitted over the TLS connection is valid
>    until it has received a matching fingerprint in an SDP answer.  If
>    the fingerprint, once it arrives, does not match the client's
>    certificate, the server endpoint MUST terminate the media connection
>    with a bad_certificate error, as stated in the previous paragraph.
> Given the outstanding issue relating to handling of unverified media, the
> Chairs of the W3C WEBRTC WG would like to request clarification from the
> IETF MMUSIC WG as to the meaning of the "MUST NOT" in the above paragraph.
> In particular, what is an implementation permitted to do with
> received data and media prior to verification? For example:
>      1. May data received over the data channel be provided to the
> application prior to verification?
>          a. If the answer to the above is "no", may unverified received data
> be delivered by the DTLS transport to SCTP, which may buffer it?
>      2. May received media be played out prior to verification?
> Bernard Aboba
> On behalf of the W3C WEBRTC WG
> _______________________________________________
> mmusic mailing list
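
For readers less familiar with the check that the quoted Section 6.2 text refers to, the following sketch shows one way to compare a peer certificate against the value of an SDP `a=fingerprint` attribute. It assumes the certificate is available as DER bytes and that the attribute value has the usual form of a hash name followed by colon-separated hex octets. The function names are hypothetical, and a real stack would take the certificate from its DTLS library rather than from a byte string.

```python
import hashlib
import hmac


def cert_fingerprint(cert_der: bytes, hash_name: str = "sha-256") -> str:
    """Hash the DER certificate and format it as colon-separated uppercase hex."""
    # SDP writes "sha-256"; hashlib expects "sha256".
    digest = hashlib.new(hash_name.replace("-", ""), cert_der).digest()
    return ":".join(f"{byte:02X}" for byte in digest)


def matches_sdp_fingerprint(cert_der: bytes, attr_value: str) -> bool:
    """Compare a certificate with the value part of an a=fingerprint line."""
    hash_name, _, expected = attr_value.strip().partition(" ")
    actual = cert_fingerprint(cert_der, hash_name.lower())
    # Hex case is normalized before comparing; compare_digest avoids early exit.
    return hmac.compare_digest(actual, expected.strip().upper())


if __name__ == "__main__":
    fake_cert = b"not a real certificate, just example bytes"
    offered = "sha-256 " + cert_fingerprint(fake_cert)
    print(matches_sdp_fingerprint(fake_cert, offered))          # True
    print(matches_sdp_fingerprint(b"another cert", offered))    # False
```

On a mismatch, the quoted text requires the endpoint to terminate the connection with a bad_certificate error; that step is left out of the sketch.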
