Re: [rtcweb] Interaction between MediaStream API and signaling

Stefan Hakansson LK <stefan.lk.hakansson@ericsson.com> Sat, 31 March 2012 05:41 UTC

On 03/31/2012 07:17 AM, Stefan Hakansson LK wrote:
> On 03/30/2012 11:39 PM, Randell Jesup wrote:
>> On 3/30/2012 4:59 AM, Stefan Hakansson LK wrote:
>>> The JS API deals with MediaStreams (this is what you send and
>>> receive using PeerConnection from an application perspective).
>>>
>>> A browser receiving RTP streams needs side info to be able to
>>> assemble those RTP streams into MediaStreams in a correct way. The
>>> current model is that this is signaled using SDP exchanges (where
>>> Harald's MSID proposal would tell which MediaStream an RTP stream
>>> belongs to).
>>>
>>> As I brought up at the mike yesterday, I think we may have a race
>>> condition for the responder.
>>>
>>> For the initiator side browser, this is clear: once an (PR-)ANSWER is
>>> received, the responder has received the SDP, and hence can map
>>> incoming RTP streams into MediaStreams.
>>>
>>> But for the responder side this is less clear to me. Imagine
>>> applications where the responder just mirrors the initiator - if one
>>> of the parties adds a MediaStream to PeerConnection, the other end
>>> would add the corresponding MediaStream.
>>>
>>> This can happen any time in the session, so ICE can very well be up
>>> and running. One example could be that the data channel is used for
>>> text chat, when one side clicks a button to start video. And the
>>> application can have asked for permission to use all input devices
>>> earlier, so no user interaction may be involved.
>>>
>>> In this situation the responder's (added) RTP streams can very well
>>> arrive before the ANSWER if I understand correctly.
>>
>> Yes.  Just like in SIP.  And so when you send an OFFER (or modified
>> re-OFFER), you must be ready to receive data per that offer even if no
>> ANSWER has been received - just like in SIP.  And if it's a re-offer, you
>> need to accept the old, and accept the new (though you could probably
>> use reception of obviously new-OFFER media to turn off
>> decoding/rendering old-OFFER in preparation for the ANSWER).
>>
>> The flip side of this is the responder has to infer when the sender
>> switches over to the result of the ANSWER from the media.  For example:
>>
>> A                                      B
>> <--- H.261 --->
>> re-OFFER(VP8) --->
>> <-- ANSWER(VP8) (delayed in reception)
>> <-----------VP8            (A should infer that B ANSWERed and accepted VP8)
>>     ---------->   H.261
>> <-- ANSWER(VP8) (received)
>> <--------VP8---------->   (B should infer by reception of VP8 that ANSWER was received)
>>
>> (Personally, I hate inferences, but without a 3 (or 4) way handshake,
>> you have to).  If your switches of codecs are staged, then this isn't
>> much of a problem.  Either leave the old codec on the list, or leave it
>> on the list until the accept, and then re-OFFER to remove the unused codec.
>
> I think I understand what you mean, and this would work fine as long as
> you just switch codecs that are used in already set-up MediaStreams.
>
> But if A in this case, as part of re-OFFERing the session, not only
> offers a new codec (VP8) for the already flowing video but also adds a
> new outgoing video stream (e.g. front cam), and then (without having
> received the ANSWER, which is delayed) starts receiving VP8 video, it
> cannot really know whether this VP8 video is new video from the
> responder's front cam or just a new codec for the existing (back cam)
> video from the responder to the initiator.

This may have been a very bad example: you can probably tell the two
cases apart by SSRC. But even so, the A browser won't know what a VP8
stream with an unknown SSRC represents until it receives the ANSWER.
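
To make the problem concrete, here is a minimal sketch of what the
receiving side could do: hold packets with an unknown SSRC until the
SDP supplies an SSRC-to-MediaStream mapping. Everything here is an
assumption for illustration (the class StreamDemux, its method names,
the SSRC values and the stream labels); it is not a browser API and
not part of any proposal on the list.

```python
from collections import defaultdict


class StreamDemux:
    """Hypothetical receiver-side demultiplexer (illustration only)."""

    def __init__(self):
        # SSRC -> MediaStream label, learned from SDP (e.g. MSID-style info).
        self.ssrc_to_stream = {}
        # SSRC -> packets that arrived before the SDP describing them.
        self.pending = defaultdict(list)
        # MediaStream label -> packets handed to that stream.
        self.delivered = defaultdict(list)

    def on_rtp(self, ssrc, packet):
        label = self.ssrc_to_stream.get(ssrc)
        if label is None:
            # Unknown SSRC: we cannot tell which MediaStream this is yet.
            self.pending[ssrc].append(packet)
        else:
            self.delivered[label].append(packet)

    def on_sdp(self, mapping):
        # An OFFER/ANSWER arrived; learn the mapping and flush held packets.
        self.ssrc_to_stream.update(mapping)
        for ssrc in list(self.pending):
            label = self.ssrc_to_stream.get(ssrc)
            if label is not None:
                self.delivered[label].extend(self.pending.pop(ssrc))


demux = StreamDemux()
demux.on_sdp({1111: "back-cam"})      # mapping from the original session
demux.on_rtp(1111, "h261-1")          # known SSRC, delivered at once
demux.on_rtp(2222, "vp8-1")           # new SSRC, ANSWER not here yet
demux.on_rtp(2222, "vp8-2")
demux.on_sdp({2222: "front-cam"})     # delayed ANSWER finally arrives
```

A real implementation would need to bound the pending buffer (by time
or size), since the ANSWER may be delayed for a long time or never
arrive.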

>
>>
>> One problem is what to do in the switchover window when you might get a
>> mixture of old and new media, especially if you moved them to different
>> ports and so can't count on RTP sequence re-ordering to un-mix them; in
>> the past I dealt with that (and long codec-switch times) by locking out
>> codec changes for a fraction of a second after I do one.  Not a huge
>> deal, however.
>>
>> My apologies if I've missed something in JSEP; I've been heads-down
>> enough in Data Channels and bring-up that I could have a disconnect here
>> and be saying something silly.
>
> Actually I don't think this is very JSEP related; it is the generic
> problem that the browser receiving RTP streams needs some side info about
> them before being able to do anything sensible with them.
>
>>
>>> I think we need to find a way to handle this. One way is to add an
>>> "ACK" that indicates to the responder that the initiator has received
>>> the ANSWER, but I'm not sure that is the best way.
>>
>> If you need to know that, you need a SIP-style ACK.
>
> As explained, I do think we need to know that.
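
As a rough illustration of the "ACK" idea, the sketch below shows a
responder that holds back newly added streams until the initiator
confirms it has received the ANSWER. The class Responder, its states,
and the offer_id parameter are assumptions made up for this example;
they do not correspond to SIP, JSEP, or any existing API, and holding
back media is only one possible use of such an ACK.

```python
class Responder:
    """Hypothetical responder-side state for an OFFER/ANSWER/ACK exchange."""

    def __init__(self):
        self.state = "stable"
        self.pending_offer = None
        self.queued_streams = []   # streams waiting for the ACK
        self.sending = []          # streams whose media is being sent

    def on_offer(self, offer_id):
        # Answer immediately, but remember that the ANSWER is unconfirmed.
        self.state = "answer-sent"
        self.pending_offer = offer_id
        return ("ANSWER", offer_id)

    def add_stream(self, label):
        if self.state == "answer-sent":
            # The initiator may not be able to map this stream yet.
            self.queued_streams.append(label)
        else:
            self.sending.append(label)

    def on_ack(self, offer_id):
        # Ignore ACKs that do not match the outstanding ANSWER.
        if self.state == "answer-sent" and offer_id == self.pending_offer:
            self.state = "stable"
            self.sending.extend(self.queued_streams)
            self.queued_streams.clear()


r = Responder()
r.on_offer(1)
r.add_stream("front-cam")   # queued: ANSWER not yet acknowledged
r.on_ack(1)                 # initiator confirms; media can start
```

The cost of this approach is an extra signaling round trip before the
new media starts flowing, which is part of why it may not be the best
way.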