Re: [rtcweb] Interaction between MediaStream API and signaling

Randell Jesup <> Fri, 30 March 2012 21:42 UTC

On 3/30/2012 4:59 AM, Stefan Hakansson LK wrote:
> The JS API deals with MediaStreams (this is what you send and
> receive using PeerConnection from an application perspective).
> A browser receiving RTP streams, needs side info to be able to 
> assemble those RTP streams into MediaStreams in a correct way. The 
> current model is that this is signaled using SDP exchanges (where 
> Harald's MSID proposal would tell which MediaStream an RTP stream
> belongs to).
> As I brought up at the mike yesterday, I think we may have a race 
> condition for the responder.
> For the initiator side browser, this is clear: once a (PR-)ANSWER is
> received, the responder has received the SDP, and hence can map 
> incoming RTP streams into MediaStreams.
> But for the responder side this is less clear to me. Imagine 
> applications where the responder just mirrors the initiator - if one 
> of the parties adds a MediaStream to PeerConnection, the other end 
> would add the corresponding MediaStream.
> This can happen any time in the session, so ICE can very well be up 
> and running. One example could be that the data channel is used for 
> text chat, when one side clicks a button to start video. And the 
> application can have asked for permission to use all input devices 
> earlier, so no user interaction may be involved.
> In this situation the responder's (added) RTP streams can very well 
> arrive before the ANSWER if I understand correctly. 

Yes.  Just like in SIP.  When you send an OFFER (or a modified
re-OFFER), you must be ready to receive data per that offer even if no
ANSWER has been received.  And if it's a re-OFFER, you need to accept
media per the old offer as well as the new one (though you could
probably use reception of obviously new-OFFER media to turn off
decoding/rendering of old-OFFER media in preparation for the ANSWER).
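
A minimal sketch of that receive-side rule, in Python. The class and
method names (OfferState, send_reoffer, acceptable, on_answer) are
hypothetical and not taken from JSEP or any browser API; codecs are
plain strings for illustration only.

```python
class OfferState:
    """Tracks which codecs the sender of an OFFER must be ready to decode."""

    def __init__(self, active_codecs):
        # Codecs agreed in the last completed offer/answer exchange.
        self.active = set(active_codecs)
        # Codecs from an OFFER that has not been ANSWERed yet.
        self.pending = set()

    def send_reoffer(self, offered_codecs):
        # Once the re-OFFER is sent, media per that offer may arrive at any time.
        self.pending = set(offered_codecs)

    def acceptable(self):
        # Until the ANSWER arrives, accept both the old and the new media.
        return self.active | self.pending

    def on_answer(self, answered_codecs):
        # The ANSWER settles the session; nothing is pending any more.
        self.active = set(answered_codecs)
        self.pending = set()


state = OfferState(["H.261"])
state.send_reoffer(["VP8"])
print(sorted(state.acceptable()))   # both codecs while the ANSWER is outstanding
state.on_answer(["VP8"])
print(sorted(state.acceptable()))   # only the answered codec afterwards
```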

The flip side is that the responder has to infer from the media when
the sender of the OFFER has switched over to the result of the ANSWER.
For example:

A                                      B
<--- H.261 --->
re-OFFER(VP8) --->
<-- ANSWER(VP8) (delayed in reception)
<-----------VP8            (A should infer that B ANSWERed and accepted VP8)
  ----------> H.261
<-- ANSWER(VP8) (received)
<--------VP8----------> (B should infer by reception of VP8 that ANSWER was received)

(Personally, I hate inferences, but without a 3- (or 4-) way handshake,
you have to make them.)  If your codec switches are staged, then this
isn't much of a problem.  Either leave the old codec on the list, or
leave it on the list until the new one is accepted, and then re-OFFER
to remove the unused codec.
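
The inference in the diagram can be sketched as below, in Python. This
is only an illustration under simplifying assumptions: SwitchInference
and its methods are hypothetical names, and each incoming packet is
reduced to a codec name.

```python
class SwitchInference:
    """Infers from incoming media that the far end has moved to the new codec."""

    def __init__(self, old_codec, new_codec):
        self.old_codec = old_codec
        self.new_codec = new_codec
        self.answer_seen = False     # explicit ANSWER arrived via signaling
        self.peer_switched = False   # inferred from received media

    def on_signaling_answer(self):
        self.answer_seen = True

    def on_media(self, codec):
        # Receiving the newly offered codec implies the far end has acted
        # on the OFFER/ANSWER, even if the signaling message is delayed.
        if codec == self.new_codec:
            self.peer_switched = True
        return self.peer_switched

    def negotiation_settled(self):
        # Either the explicit ANSWER or the inference is enough.
        return self.answer_seen or self.peer_switched


a_side = SwitchInference(old_codec="H.261", new_codec="VP8")
a_side.on_media("H.261")             # still old media, nothing inferred
a_side.on_media("VP8")               # VP8 arrives before the delayed ANSWER
print(a_side.negotiation_settled())
```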

One problem is what to do in the switchover window when you might get a 
mixture of old and new media, especially if you moved them to different 
ports and so can't count on RTP sequence re-ordering to un-mix them; in 
the past I dealt with that (and long codec-switch times) by locking out 
codec changes for a fraction of a second after I do one.  Not a huge 
deal, however.
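
A rough sketch of such a lockout, in Python. The 0.5 second default is
an assumed value standing in for "a fraction of a second", and
CodecSwitchGuard is a hypothetical name; the clock is injectable so the
behaviour can be checked without sleeping.

```python
import time


class CodecSwitchGuard:
    """Refuses a codec change that follows the previous one too closely."""

    def __init__(self, lockout_s=0.5, clock=time.monotonic):
        self.lockout_s = lockout_s   # assumed lockout length, in seconds
        self.clock = clock
        self.last_switch = None      # time of the last accepted switch

    def try_switch(self):
        now = self.clock()
        if self.last_switch is not None and now - self.last_switch < self.lockout_s:
            return False             # still inside the lockout window
        self.last_switch = now       # rejected attempts do not extend the window
        return True


guard = CodecSwitchGuard()
print(guard.try_switch())   # first switch is allowed
print(guard.try_switch())   # an immediate second switch is locked out
```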

My apologies if I've missed something in JSEP; I've been heads-down 
enough in Data Channels and bring-up that I could have a disconnect here 
and be saying something silly.

> I think we need to find a way to handle this. One way is to add an 
> "ACK" that indicates to the responder that the initiator has received 
> the ANSWER, but I'm not sure that is the best way.

If you need to know that, you need a SIP-style ACK.

Randell Jesup