Re: [rtcweb] Interaction between MediaStream API and signaling

Stefan Hakansson LK <stefan.lk.hakansson@ericsson.com> Sat, 31 March 2012 05:17 UTC

On 03/30/2012 11:39 PM, Randell Jesup wrote:
> On 3/30/2012 4:59 AM, Stefan Hakansson LK wrote:
>> The JS API deals with MediaStreams (this is what you send and
>> receive using PeerConnection from an application perspective).
>>
>> A browser receiving RTP streams, needs side info to be able to
>> assemble those RTP streams into MediaStreams in a correct way. The
>> current model is that this is signaled using SDP exchanges (where
>> Harald's MSID proposal would tell which MediaStream an RTP stream
>> belongs to).
>>
>> As I brought up at the mike yesterday, I think we may have a race
>> condition for the responder.
>>
>> For the initiator side browser, this is clear: once an (PR-)ANSWER is
>> received, the responder has received the SDP, and hence can map
>> incoming RTP streams into MediaStreams.
>>
>> But for the responder side this is less clear to me. Imagine
>> applications where the responder just mirrors the initiator - if one
>> of the parties adds a MediaStream to PeerConnection, the other end
>> would add the corresponding MediaStream.
>>
>> This can happen any time in the session, so ICE can very well be up
>> and running. One example could be that the data channel is used for
>> text chat, when one side clicks a button to start video. And the
>> application can have asked for permission to use all input devices
>> earlier, so no user interaction may be involved.
>>
>> In this situation the responder's (added) RTP streams can very well
>> arrive before the ANSWER if I understand correctly.
>
> Yes.  Just like in SIP.  And so when you send an OFFER (or modified
> re-OFFER), you must be ready to receive data per that offer even if no
> ANSWER has been received - just like in SIP.  And if it's a re-offer, you
> need to accept the old, and accept the new (though you could probably
> use reception of obviously new-OFFER media to turn off
> decoding/rendering old-OFFER in preparation for the ANSWER).
>
> The flip side of this is the responder has to infer when the sender
> switches over to the result of the ANSWER from the media.  For example:
>
> A                                      B
> <--- H.261 --->
> re-OFFER(VP8) --->
> <-- ANSWER(VP8) (delayed in reception)
> <-----------VP8            (A should infer that B ANSWERed and accepted VP8)
>    ---------->  H.261
> <-- ANSWER(VP8) (received)
> <--------VP8---------->  (B should infer by reception of VP8 that ANSWER
> was received)
>
> (Personally, I hate inferences, but without a 3 (or 4) way handshake,
> you have to).  If your switches of codecs are staged, then this isn't
> (much) of a problem.  Either leave old codec on the list, or leave it on
> the list until accept, and then re-OFFER to remove the un-used codec.

I think I understand what you mean, and this would work fine as long as 
you just switch codecs that are used in already set-up MediaStreams.

But suppose A, as part of the re-OFFER, not only offers a new codec
(VP8) for the already flowing video but also adds a new outgoing video
stream (e.g. front cam). If A then starts receiving VP8 video before
the ANSWER arrives (delayed in reception), it cannot really know
whether this VP8 video is new video from the responder's front cam or
just the existing (back cam) video from the responder, now sent with
the new codec.
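
To make the ambiguity concrete, here is a minimal sketch in Python. It
is only a model of the reasoning, not browser code: PendingOffer,
possible_meanings and the strings it returns are made-up names, and it
assumes that nothing in the arriving media itself distinguishes the two
cases.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PendingOffer:
    """What A put in its re-OFFER (hypothetical model, not a real API)."""
    new_codec: str                # codec offered for the video already flowing
    adds_stream: bool             # the offer also adds a new outgoing stream
    answer: Optional[str] = None  # what the ANSWER says, once it has arrived


def possible_meanings(offer: PendingOffer, incoming_codec: str) -> list:
    """Return every reading A can give to newly arriving video."""
    if offer.answer is not None:
        # Side info is available, so there is nothing to guess.
        return [offer.answer]
    if incoming_codec != offer.new_codec:
        return ["existing stream, old codec"]
    meanings = ["existing stream, new codec"]
    if offer.adds_stream:
        # The responder may have mirrored the added stream.
        meanings.append("new stream from the responder")
    return meanings
```

With PendingOffer("VP8", adds_stream=True) and no ANSWER yet, incoming
VP8 has two possible readings; with adds_stream=False there is only one,
which is the codec-switch case discussed above.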

>
> One problem is what to do in the switchover window when you might get a
> mixture of old and new media, especially if you moved them to different
> ports and so can't count on RTP sequence re-ordering to un-mix them; in
> the past I dealt with that (and long codec-switch times) by locking out
> codec changes for a fraction of a second after I do one.  Not a huge
> deal, however.
>
> My apologies if I've missed something in JSEP; I've been heads-down
> enough in Data Channels and bring-up that I could have a disconnect here
> and be saying something silly.

Actually I don't think this is very JSEP related; it is the generic
problem that a browser receiving RTP streams needs some side info about
them before being able to do anything sensible with them.
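
As a rough illustration of that dependency, the sketch below models a
receiver that can only place RTP into a MediaStream once signaling has
told it where the stream belongs. Everything here is an assumption made
for the example: the Receiver class, the opaque stream_id key and the
labels are hypothetical, and holding early packets is just one possible
behaviour, not something agreed on the list.

```python
class Receiver:
    """Toy model of mapping incoming RTP streams to MediaStreams."""

    def __init__(self):
        self.side_info = {}      # stream id -> MediaStream label (from signaling)
        self.unassigned = {}     # stream id -> packets seen before side info
        self.media_streams = {}  # MediaStream label -> packets delivered to it

    def on_rtp(self, stream_id, packet):
        label = self.side_info.get(stream_id)
        if label is None:
            # No side info yet: the receiver cannot tell where this belongs.
            self.unassigned.setdefault(stream_id, []).append(packet)
        else:
            self.media_streams.setdefault(label, []).append(packet)

    def on_side_info(self, stream_id, label):
        # Signaling arrived: record the mapping and flush held packets.
        self.side_info[stream_id] = label
        for packet in self.unassigned.pop(stream_id, []):
            self.media_streams.setdefault(label, []).append(packet)
```

Media that arrives before on_side_info() sits in unassigned, which is
the window the race condition is about.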

>
>> I think we need to find a way to handle this. One way is to add an
>> "ACK" that indicates to the responder that the initiator has received
>> the ANSWER, but I'm not sure that is the best way.
>
> If you need to know that, you need a SIP-style ACK.

As explained, I do think we need to know that.
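
A sketch of what such an ACK would give the responder, written as a
small state machine in Python. The names (State, Responder,
initiator_can_map_streams) are invented for this example and do not
come from any draft; it only shows that the responder cannot assume the
ANSWER has arrived until the ACK is seen.

```python
from enum import Enum


class State(Enum):
    IDLE = "idle"
    ANSWER_SENT = "answer-sent"    # ANSWER sent, delivery not yet confirmed
    ANSWER_ACKED = "answer-acked"  # initiator confirmed it has the ANSWER


class Responder:
    """Hypothetical responder-side view of an OFFER/ANSWER/ACK exchange."""

    def __init__(self):
        self.state = State.IDLE

    def send_answer(self):
        self.state = State.ANSWER_SENT

    def on_ack(self):
        if self.state is not State.ANSWER_SENT:
            raise RuntimeError("ACK received without an outstanding ANSWER")
        self.state = State.ANSWER_ACKED

    def initiator_can_map_streams(self):
        # Only after the ACK does the responder know the initiator holds
        # the side info needed to map the responder's RTP streams.
        return self.state is State.ANSWER_ACKED
```

Between send_answer() and on_ack() the method returns False, which
corresponds to the period where added RTP streams may reach the
initiator before the ANSWER does.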

>