Re: [rtcweb] Interaction between MediaStream API and signaling

Stefan Hakansson LK <> Sat, 31 March 2012 05:41 UTC

On 03/31/2012 07:17 AM, Stefan Hakansson LK wrote:
> On 03/30/2012 11:39 PM, Randell Jesup wrote:
>> On 3/30/2012 4:59 AM, Stefan Hakansson LK wrote:
>>> The JS API deals with MediaStreams (this is what you send and
>>> receive using PeerConnection from an application perspective).
>>> A browser receiving RTP streams needs side info to be able to
>>> assemble those RTP streams into MediaStreams in a correct way. The
>>> current model is that this is signaled using SDP exchanges (where
>>> Harald's MSID proposal would tell which MediaStream an RTP stream
>>> belongs to).
>>> As I brought up at the mike yesterday, I think we may have a race
>>> condition for the responder.
>>> For the initiator side browser, this is clear: once an (PR-)ANSWER is
>>> received, the responder has received the SDP, and hence can map
>>> incoming RTP streams into MediaStreams.
>>> But for the responder side this is less clear to me. Imagine
>>> applications where the responder just mirrors the initiator - if one
>>> of the parties adds a MediaStream to PeerConnection, the other end
>>> would add the corresponding MediaStream.
>>> This can happen any time in the session, so ICE can very well be up
>>> and running. One example could be that the data channel is used for
>>> text chat, when one side clicks a button to start video. And the
>>> application can have asked for permission to use all input devices
>>> earlier, so no user interaction may be involved.
>>> In this situation the responder's (added) RTP streams can very well
>>> arrive before the ANSWER if I understand correctly.
>> Yes.  Just like in SIP.  And so when you send an OFFER (or modified
>> re-OFFER), you must be ready to receive data per that offer even if no
>> ANSWER has been received - just like in SIP.  And if it's a re-offer, you
>> need to accept the old, and accept the new (though you could probably
>> use reception of obviously new-OFFER media to turn off
>> decoding/rendering old-OFFER in preparation for the ANSWER).
>> The flip side of this is the responder has to infer when the sender
>> switches over to the result of the ANSWER from the media.  For example:
>> A                                      B
>> <--- H.261 --->
>> re-OFFER(VP8) --->
>> <-- ANSWER(VP8) (delayed in reception)
>> <-----------VP8            (A should infer that B ANSWERed and accepted VP8)
>>     ---------->   H.261
>> <-- ANSWER(VP8) (received)
>> <--------VP8---------->   (B should infer by reception of VP8 that ANSWER
>> was received)
>> (Personally, I hate inferences, but without a 3 (or 4) way handshake,
>> you have to).  If your switches of codecs are staged, then this isn't
>> (much) of a problem.  Either leave old codec on the list, or leave it on
>> the list until accept, and then re-OFFER to remove the un-used codec.
> I think I understand what you mean, and this would work fine as long as
> you just switch codecs that are used in already set-up MediaStreams.
> But if A in this case, as part of re-OFFERING the session, not only
> offers a new codec (VP8) for the already flowing video but also adds a
> new outgoing video stream (e.g. front cam), and then (without receiving
> the ANSWER - delayed in reception) starts receiving VP8 video it could
> not really know if this VP8 video is new video from the responder's front
> cam or just a new codec for the existing (back cam) video from the
> responder to the sender.

This may have been a very bad example. You can probably tell them apart
by the SSRC. But even so, the A browser won't know what the VP8 stream
(if it has an unknown SSRC) represents without receiving the ANSWER.
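
To make the SSRC point concrete, here is a minimal sketch of a receiver that holds packets from an unknown SSRC until the ANSWER supplies the side info. Everything in it is an assumption for illustration: `RtpPacket`, `StreamDemux`, the `on_rtp`/`on_answer` methods, the stream labels and the SSRC values are hypothetical, not part of any browser API or of the MSID proposal.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class RtpPacket:
    ssrc: int          # synchronization source identifier from the RTP header
    payload_type: int  # identifies the codec, e.g. the number negotiated for VP8
    seq: int


@dataclass
class StreamDemux:
    """Hypothetical receiver-side demultiplexer, for illustration only."""
    # SSRC -> MediaStream label, learned from signaling (assumed MSID-like info)
    ssrc_to_stream: dict = field(default_factory=dict)
    # unknown SSRC -> packets held until the ANSWER arrives
    pending: dict = field(default_factory=lambda: defaultdict(list))
    # MediaStream label -> packets handed to that stream
    delivered: dict = field(default_factory=lambda: defaultdict(list))

    def on_rtp(self, pkt):
        label = self.ssrc_to_stream.get(pkt.ssrc)
        if label is None:
            # No side info yet: we cannot say which MediaStream this is.
            self.pending[pkt.ssrc].append(pkt)
            return None
        self.delivered[label].append(pkt)
        return label

    def on_answer(self, mapping):
        # The ANSWER carries the SSRC -> MediaStream mapping (assumption).
        self.ssrc_to_stream.update(mapping)
        for ssrc in list(self.pending):
            label = self.ssrc_to_stream.get(ssrc)
            if label is not None:
                self.delivered[label].extend(self.pending.pop(ssrc))


# Existing back cam stream has a known SSRC; it just switched to VP8.
demux = StreamDemux(ssrc_to_stream={0x1111: "back-cam"})
known = demux.on_rtp(RtpPacket(ssrc=0x1111, payload_type=100, seq=1))
# New VP8 stream with an unknown SSRC arrives before the ANSWER.
unknown = demux.on_rtp(RtpPacket(ssrc=0x2222, payload_type=100, seq=1))
held_before_answer = len(demux.pending[0x2222])
# The delayed ANSWER finally arrives and names the new stream.
demux.on_answer({0x2222: "front-cam"})
```

The codec switch on the known SSRC is unambiguous, while the packet on the new SSRC can only be buffered (or dropped) until the ANSWER shows up.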

>> One problem is what to do in the switchover window when you might get a
>> mixture of old and new media, especially if you moved them to different
>> ports and so can't count on RTP sequence re-ordering to un-mix them; in
>> the past I dealt with that (and long codec-switch times) by locking out
>> codec changes for a fraction of a second after I do one.  Not a huge
>> deal, however.
>> My apologies if I've missed something in JSEP; I've been heads-down
>> enough in Data Channels and bring-up that I could have a disconnect here
>> and be saying something silly.
> Actually I don't think this is very JSEP related; it is the generic
> problem that the browser receiving RTP streams needs some side info about
> them before being able to do anything sensible with them.
>>> I think we need to find a way to handle this. One way is to add an
>>> "ACK" that indicates to the responder that the initiator has received
>>> the ANSWER, but I'm not sure that is the best way.
>> If you need to know that, you need a SIP-style ACK.
> As explained, I do think we need to know that.
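
One possible shape for such an "ACK", sketched as a small responder-side state machine. This is only an illustration under assumptions: the `Responder` class, its method names, and the choice to hold back newly added streams until the ACK arrives are all hypothetical; the thread only proposes that an ACK tells the responder the ANSWER was received.

```python
from enum import Enum, auto


class ResponderState(Enum):
    STABLE = auto()
    ANSWER_SENT = auto()  # ANSWER sent, not yet known to have arrived


class Responder:
    """Hypothetical responder; the ACK message is a proposal, not an existing API."""

    def __init__(self):
        self.state = ResponderState.STABLE
        self.queued_streams = []  # added in the ANSWER, not yet sent as RTP
        self.sending = []         # streams whose RTP is actually flowing

    def on_offer_send_answer(self, new_streams):
        # Assumed policy: queue the new streams instead of sending at once.
        self.queued_streams.extend(new_streams)
        self.state = ResponderState.ANSWER_SENT

    def on_ack(self):
        # The initiator confirms it has the ANSWER, so it can map our RTP.
        if self.state is not ResponderState.ANSWER_SENT:
            return  # stray or duplicate ACK, ignore
        self.sending.extend(self.queued_streams)
        self.queued_streams.clear()
        self.state = ResponderState.STABLE


responder = Responder()
responder.on_offer_send_answer(["front-cam"])
before_ack = list(responder.sending)
responder.on_ack()
after_ack = list(responder.sending)
```

Holding media back costs one extra round trip before the new stream starts; the alternative is to send immediately and let the initiator buffer, as discussed earlier in the thread.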