Re: [rtcweb] Interaction between MediaStream API and signaling

Justin Uberti <> Mon, 02 April 2012 03:17 UTC


On Sat, Mar 31, 2012 at 1:41 AM, Stefan Hakansson LK <> wrote:

> On 03/31/2012 07:17 AM, Stefan Hakansson LK wrote:
>> On 03/30/2012 11:39 PM, Randell Jesup wrote:
>>> On 3/30/2012 4:59 AM, Stefan Hakansson LK wrote:
>>>> The JS API deals with MediaStreams (this is what you send and
>>>> receive using PeerConnection from an application perspective).
>>>> A browser receiving RTP streams, needs side info to be able to
>>>> assemble those RTP streams into MediaStreams in a correct way. The
>>>> current model is that this is signaled using SDP exchanges (where
>>>> Harald's MSID proposal would tell which MediaStream an RTP stream
>>>> belongs to).
>>>> As I brought up at the mike yesterday, I think we may have a race
>>>> condition for the responder.
>>>> For the initiator side browser, this is clear: once a (PR-)ANSWER is
>>>> received, the responder has received the SDP, and hence can map
>>>> incoming RTP streams into MediaStreams.
>>>> But for the responder side this is less clear to me. Imagine
>>>> applications where the responder just mirrors the initiator - if one
>>>> of the parties adds a MediaStream to PeerConnection, the other end
>>>> would add the corresponding MediaStream.
>>>> This can happen any time in the session, so ICE can very well be up
>>>> and running. One example could be that the data channel is used for
>>>> text chat, when one side clicks a button to start video. And the
>>>> application can have asked for permission to use all input devices
>>>> earlier, so no user interaction may be involved.
>>>> In this situation the responder's (added) RTP streams can very well
>>>> arrive before the ANSWER if I understand correctly.
>>> Yes.  Just like in SIP.  And so when you send an OFFER (or modified
>>> re-OFFER), you must be ready to receive data per that offer even if no
>>> ANSWER has been received - just like in SIP.  And if it's a re-offer, you
>>> need to accept the old, and accept the new (though you could probably
>>> use reception of obviously new-OFFER media to turn off
>>> decoding/rendering old-OFFER in preparation for the ANSWER).
>>> The flip side of this is the responder has to infer when the sender
>>> switches over to the result of the ANSWER from the media.  For example:
>>> A                                      B
>>> <--- H.261 --->
>>> re-OFFER(VP8) --->
>>> <-- ANSWER(VP8)          (delayed in reception)
>>> <----------- VP8         (A should infer that B ANSWERed and
>>>                           accepted VP8)
>>> ----------> H.261
>>> <-- ANSWER(VP8)          (received)
>>> <-------- VP8 ---------> (B should infer by reception of VP8
>>>                           that ANSWER was received)
>>> (Personally, I hate inferences, but without a 3 (or 4) way handshake,
>>> you have to).  If your switches of codecs are staged, then this isn't
>>> (much of) a problem.  Either leave old codec on the list, or leave it on
>>> the list until accept, and then re-OFFER to remove the un-used codec.
>> I think I understand what you mean, and this would work fine as long as
>> you just switch codecs that are used in already set-up MediaStreams.
>> But if A in this case, as part of re-OFFERING the session, not only
>> offers a new codec (VP8) for the already flowing video but also adds a
>> new outgoing video stream (e.g. front cam), and then (without receiving
>> the ANSWER - delayed in reception) starts receiving VP8 video it could
>> not really know if this VP8 video is new video from the responder's front
>> cam or just a new codec for the existing (back cam) video from the
>> responder to the sender.
> This may have been a very bad example. Probably you can tell them apart on
> the SSRC. But even so, the A browser won't know what the VP8 stream (if it
> has an unknown SSRC) represents without receiving the ANSWER.

I think this is only an issue if you decide to add streams in the ANSWER.
But even so, eventually the ANSWER arrives and you can start
demuxing/decoding appropriately.
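
To make "eventually the ANSWER arrives" concrete, here is a rough sketch of a
receiver that parks packets with an unknown SSRC and releases them once
signaling says which MediaStream they belong to. Everything named here
(PendingDemuxer, on_rtp, on_answer, the per-SSRC buffer limit) is hypothetical
and not part of any browser API, and it assumes the ANSWER carries an
SSRC-to-MediaStream mapping along the lines of the MSID proposal.

```python
from collections import defaultdict


class PendingDemuxer:
    """Holds RTP packets with unknown SSRCs until signaling names their stream."""

    def __init__(self, max_pending_per_ssrc=50):
        self.ssrc_to_stream = {}            # SSRC -> MediaStream label, learned from SDP
        self.pending = defaultdict(list)    # SSRC -> packets waiting for the ANSWER
        self.delivered = defaultdict(list)  # MediaStream label -> packets handed on
        self.max_pending = max_pending_per_ssrc

    def on_rtp(self, ssrc, payload):
        stream = self.ssrc_to_stream.get(ssrc)
        if stream is not None:
            self.delivered[stream].append(payload)
            return
        queue = self.pending[ssrc]
        queue.append(payload)
        # Bound memory: keep only the newest packets while the ANSWER is in flight.
        if len(queue) > self.max_pending:
            del queue[0]

    def on_answer(self, mapping):
        """mapping: SSRC -> MediaStream label (assumed to come from the ANSWER)."""
        self.ssrc_to_stream.update(mapping)
        for ssrc, stream in mapping.items():
            # Flush anything that arrived early, in arrival order.
            self.delivered[stream].extend(self.pending.pop(ssrc, []))


demux = PendingDemuxer()
demux.on_rtp(0x1111, b"early-vp8-frame")  # media beats the ANSWER
demux.on_answer({0x1111: "front-cam"})    # signaling catches up
demux.on_rtp(0x1111, b"next-vp8-frame")   # now routed directly
```

Whether to buffer, drop, or decode-but-not-render the early packets is a
policy choice; the sketch buffers only to show that nothing is lost by waiting.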

Regardless, if the app wants to require some sort of ACK message to begin
transmission, or perhaps require that the remote side ask for the media
streams it wants to be sent, I think this could be implemented in the
app-specific signaling layer; the streams could initially be added as
inactive, and only changed to sendrecv when the ACK arrives.
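
As an illustration of that app-level gating, here is a minimal sketch. The
class and method names (GatedStream, Responder, on_app_ack) and the shape of
the ACK message are made up for the example; in a real application, flipping
the direction would also go through whatever offer/answer update the API
requires, which is left out here.

```python
class GatedStream:
    """A stream the app has added but is not yet allowed to send."""

    def __init__(self, label):
        self.label = label
        self.direction = "inactive"  # added as inactive until the far end ACKs


class Responder:
    def __init__(self):
        self.streams = {}

    def add_stream(self, label):
        # Hypothetical: the app mirrors the initiator by adding its own stream.
        stream = GatedStream(label)
        self.streams[label] = stream
        return stream

    def on_app_ack(self, labels):
        # The app-specific ACK lists the streams the remote side is ready for.
        for label in labels:
            stream = self.streams.get(label)
            if stream is not None and stream.direction == "inactive":
                stream.direction = "sendrecv"

    def sendable(self):
        # Only streams that have been ACKed may transmit media.
        return sorted(
            label for label, s in self.streams.items()
            if s.direction == "sendrecv"
        )


responder = Responder()
responder.add_stream("front-cam")
responder.add_stream("back-cam")
before_ack = responder.sendable()    # nothing flows yet
responder.on_app_ack(["front-cam"])  # remote side asked for one stream only
after_ack = responder.sendable()
```

The same shape covers the "remote side asks for the streams it wants" variant:
the request simply plays the role of the ACK.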

>>> One problem is what to do in the switchover window when you might get a
>>> mixture of old and new media, especially if you moved them to different
>>> ports and so can't count on RTP sequence re-ordering to un-mix them; in
>>> the past I dealt with that (and long codec-switch times) by locking out
>>> codec changes for a fraction of a second after I do one.  Not a huge
>>> deal, however.
>>> My apologies if I've missed something in JSEP; I've been heads-down
>>> enough in Data Channels and bring-up that I could have a disconnect here
>>> and be saying something silly.
>> Actually I don't think this is very JSEP related; it is the generic
>> problem that the browser receiving RTP streams needs some side info about
>> them before being able to do anything sensible with them.
>>>> I think we need to find a way to handle this. One way is to add an
>>>> "ACK" that indicates to the responder that the initiator has received
>>>> the ANSWER, but I'm not sure that is the best way.
>>> If you need to know that, you need a SIP-style ACK.
>> As explained, I do think we need to know that.
>> _______________________________________________
>> rtcweb mailing list