Re: [rtcweb] Interaction between MediaStream API and signaling

Stefan Hakansson LK <stefan.lk.hakansson@ericsson.com> Mon, 02 April 2012 13:55 UTC

Message-ID: <4F79AFD1.8030401@ericsson.com>
Date: Mon, 02 Apr 2012 15:55:29 +0200
From: Stefan Hakansson LK <stefan.lk.hakansson@ericsson.com>
To: Justin Uberti <juberti@google.com>
References: <4F7575FB.8010201@ericsson.com> <4F762813.6040506@jesup.org> <4F76937B.9020901@ericsson.com> <4F7698EF.1090306@ericsson.com> <CAOJ7v-0TKsJ9kF73w357GXEGfyNeheZb5Unfqm0hf_PN6tmOQw@mail.gmail.com>
In-Reply-To: <CAOJ7v-0TKsJ9kF73w357GXEGfyNeheZb5Unfqm0hf_PN6tmOQw@mail.gmail.com>
Cc: "rtcweb@ietf.org" <rtcweb@ietf.org>
Subject: Re: [rtcweb] Interaction between MediaStream API and signaling
Precedence: list

On 04/02/2012 05:16 AM, Justin Uberti wrote:
>
>
> On Sat, Mar 31, 2012 at 1:41 AM, Stefan Hakansson LK
> <stefan.lk.hakansson@ericsson.com> wrote:
>
>     On 03/31/2012 07:17 AM, Stefan Hakansson LK wrote:
>
>         On 03/30/2012 11:39 PM, Randell Jesup wrote:
>
>             On 3/30/2012 4:59 AM, Stefan Hakansson LK wrote:
>
>                 The JS API deals with MediaStreams (this is what you
>                 send and
>                 receive using PeerConnection from an application
>                 perspective).
>
>                 A browser receiving RTP streams needs side info to be
>                 able to
>                 assemble those RTP streams into MediaStreams in a
>                 correct way. The
>                 current model is that this is signaled using SDP
>                 exchanges (where
>                 Harald's MSID proposal would tell which MediaStream an
>                 RTP stream
>                 belongs to).
>
>                 As I brought up at the mike yesterday, I think we may
>                 have a race
>                 condition for the responder.
>
>                 For the initiator side browser, this is clear: once an
>                 (PR-)ANSWER is
>                 received, the responder has received the SDP, and hence
>                 can map
>                 incoming RTP streams into MediaStreams.
>
>                 But for the responder side this is less clear to me. Imagine
>                 applications where the responder just mirrors the
>                 initiator - if one
>                 of the parties adds a MediaStream to PeerConnection, the
>                 other end
>                 would add the corresponding MediaStream.
>
>                 This can happen any time in the session, so ICE can very
>                 well be up
>                 and running. One example could be that the data channel
>                 is used for
>                 text chat, when one side clicks a button to start video.
>                 And the
>                 application can have asked for permission to use all
>                 input devices
>                 earlier, so no user interaction may be involved.
>
>                 In this situation the responder's (added) RTP streams
>                 can very well
>                 arrive before the ANSWER if I understand correctly.
>
>
>             Yes.  Just like in SIP.  And so when you send an OFFER (or
>             modified
>             re-OFFER), you must be ready to receive data per that offer
>             even if no
>             ANSWER has been received - just like in SIP.  And if it's a
>             re-offer, you
>             need to accept the old, and accept the new (though you could
>             probably
>             use reception of obviously new-OFFER media to turn off
>             decoding/rendering old-OFFER in preparation for the ANSWER).
>
>             The flip side of this is the responder has to infer when the
>             sender
>             switches over to the result of the ANSWER from the media.
>               For example:
>
>             A                                      B
>             <--- H.261 --->
>             re-OFFER(VP8) --->
>             <-- ANSWER(VP8) (delayed in reception)
>             <----------- VP8         (A should infer that B ANSWERed
>                                       and accepted VP8)
>             ----------> H.261
>             <-- ANSWER(VP8) (received)
>             <-------- VP8 ---------->  (B should infer by reception of
>                                         VP8 that ANSWER was received)
>
>             (Personally, I hate inferences, but without a 3 (or 4) way
>             handshake,
>             you have to).  If your switches of codecs are staged, then
>             this isn't
>             (much) of a problem.  Either leave old codec on the list, or
>             leave it on
>             the list until accept, and then re-OFFER to remove the
>             un-used codec.
>
>
>         I think I understand what you mean, and this would work fine as
>         long as
>         you just switch codecs that are used in already set-up MediaStreams.
>
>         But if A in this case, as part of re-OFFERING the session, not only
>         offers a new codec (VP8) for the already flowing video but also
>         adds a
>         new outgoing video stream (e.g. front cam), and then (without
>         receiving
>         the ANSWER - delayed in reception) starts receiving VP8 video it
>         could
>         not really know if this VP8 video is new video from the
>         responder's front
>         cam or just a new codec for the existing (back cam) video from the
>         responder to the sender.
>
>
>     This may have been a very bad example. Probably you can tell them
>     apart on the SSRC. But even so, the A browser won't know what the
>     VP8 stream (if it has an unknown SSRC) represents without receiving
>     the ANSWER.
>
>
> I think this is only an issue if you decide to add streams in the
> ANSWER. But even so, eventually the ANSWER arrives and you can start
> demuxing/decoding appropriately.

Yes, I agree that it is only an issue if you add streams in the 
answer. Perhaps it is a model we should move away from - but that is the 
model used in the basic examples of the JSEP draft.

I think there is a risk of clipping if you start sending immediately, 
but can only start demuxing/decoding once an answer is received.
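
To make the clipping risk concrete, here is a minimal sketch of a
receiver that has to hold RTP packets with an unknown SSRC until the
ANSWER supplies the SSRC-to-MediaStream mapping. Everything in it is an
assumption for illustration: the class name PendingDemuxer, the bounded
buffer, and the shape of the mapping are hypothetical and not taken from
any draft or browser implementation.

```python
from collections import deque


class PendingDemuxer:
    """Hypothetical receiver that buffers RTP packets of unmapped SSRCs."""

    def __init__(self, max_pending=50):
        # Bounded buffer: once full, the oldest packet is lost (clipping).
        self.pending = deque(maxlen=max_pending)
        self.ssrc_to_stream = {}  # filled in when the ANSWER arrives
        self.delivered = []       # (stream_id, seq) pairs passed on for decoding
        self.clipped = 0          # packets lost while waiting for the ANSWER

    def on_rtp(self, ssrc, seq):
        stream_id = self.ssrc_to_stream.get(ssrc)
        if stream_id is not None:
            self.delivered.append((stream_id, seq))
            return
        if len(self.pending) == self.pending.maxlen:
            self.clipped += 1     # append below pushes out the oldest packet
        self.pending.append((ssrc, seq))

    def on_answer(self, mapping):
        """mapping: SSRC -> MediaStream id, assumed to be carried in the ANSWER."""
        self.ssrc_to_stream.update(mapping)
        still_unknown = deque(maxlen=self.pending.maxlen)
        for ssrc, seq in self.pending:
            stream_id = self.ssrc_to_stream.get(ssrc)
            if stream_id is None:
                still_unknown.append((ssrc, seq))
            else:
                self.delivered.append((stream_id, seq))
        self.pending = still_unknown


# Five packets arrive before the ANSWER, but only three can be held.
demux = PendingDemuxer(max_pending=3)
for seq in range(5):
    demux.on_rtp(ssrc=0x1234, seq=seq)
demux.on_answer({0x1234: "front-cam"})
print(demux.delivered)
print(demux.clipped)
```

In this sketch the first two packets are lost before the mapping is
known, which is the clipping I am worried about. A larger buffer only
trades clipping for memory and delay.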

>
> Regardless, if the app wants to require some sort of ACK message to
> begin transmission, or perhaps require that the remote side ask for the
> media streams it wants to be sent, I think this could be implemented in
> the app-specific signaling layer; the streams could initially be added
> as inactive, and only changed to sendrecv when the ACK arrives.

To me it is the browser that would need the info to be able to give 
something sensible to the app for playout - not the app itself.
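
For reference, the app-level gating suggested in the quoted text could
look roughly like the sketch below: streams are added as inactive and
only switched to sendrecv when an application-defined ACK arrives. The
names GatedSender, add_stream, on_ack and send are hypothetical and are
not part of PeerConnection or JSEP. This is an illustration of the idea
only.

```python
class GatedSender:
    """Hypothetical app-layer gate: no media is sent until an ACK arrives."""

    def __init__(self):
        self.direction = {}  # stream id -> "inactive" or "sendrecv"
        self.sent = []       # (stream_id, payload) pairs actually transmitted

    def add_stream(self, stream_id):
        # New streams start out inactive, as proposed in the quoted text.
        self.direction[stream_id] = "inactive"

    def on_ack(self, stream_id):
        # The ACK is an assumed app-specific signaling message.
        if stream_id in self.direction:
            self.direction[stream_id] = "sendrecv"

    def send(self, stream_id, payload):
        if self.direction.get(stream_id) != "sendrecv":
            return False     # still gated: nothing goes on the wire
        self.sent.append((stream_id, payload))
        return True


sender = GatedSender()
sender.add_stream("front-cam")
before_ack = sender.send("front-cam", b"frame-0")  # gated
sender.on_ack("front-cam")
after_ack = sender.send("front-cam", b"frame-1")   # transmitted
print(before_ack, after_ack, sender.sent)
```

Note that this only delays sending by the time it takes the ACK to
arrive. It does not by itself give the receiving browser the mapping it
needs.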

>
>
>
>
>
>             One problem is what to do in the switchover window when you
>             might get a
>             mixture of old and new media, especially if you moved them
>             to different
>             ports and so can't count on RTP sequence re-ordering to
>             un-mix them; in
>             the past I dealt with that (and long codec-switch times) by
>             locking out
>             codec changes for a fraction of a second after I do one.
>               Not a huge
>             deal, however.
>
>             My apologies if I've missed something in JSEP; I've been
>             heads-down
>             enough in Data Channels and bring-up that I could have a
>             disconnect here
>             and be saying something silly.
>
>
>         Actually I don't think this is very JSEP related; it is the generic
>         problem that the browser receiving RTP streams needs some side
>         info about
>         them before being able to do anything sensible with them.
>
>
>                 I think we need to find a way to handle this. One way is
>                 to add an
>                 "ACK" that indicates to the responder that the initiator
>                 has received
>                 the ANSWER, but I'm not sure that is the best way.
>
>
>             If you need to know that, you need a SIP-style ACK.
>
>
>         As explained, I do think we need to know that.
>
>
>
>         _________________________________________________
>         rtcweb mailing list
>         rtcweb@ietf.org
>         https://www.ietf.org/mailman/listinfo/rtcweb
>
>
>

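The codec-change lockout mentioned in the quoted text (refusing further
codec changes for a fraction of a second after one has been made) can be
sketched as follows. The class name CodecSwitchGuard, the 0.25 second
default, and the packet-level interface are assumptions made for
illustration. The original description gives no concrete values or API.

```python
class CodecSwitchGuard:
    """Hypothetical receiver-side guard against rapid codec flip-flopping."""

    def __init__(self, lockout_s=0.25):
        self.lockout_s = lockout_s             # assumed lockout duration
        self.current = None                    # codec currently being decoded
        self.last_switch = float("-inf")       # time of the last codec change

    def accept(self, codec, now):
        """Return True if a packet of `codec` arriving at `now` is decoded."""
        if self.current is None:
            self.current = codec               # first media seen: no switch yet
            return True
        if codec == self.current:
            return True
        if now - self.last_switch < self.lockout_s:
            return False                       # inside the lockout: drop stray media
        self.current = codec                   # outside the lockout: switch codec
        self.last_switch = now
        return True


guard = CodecSwitchGuard(lockout_s=0.25)
results = [
    guard.accept("H.261", 0.0),  # initial media
    guard.accept("VP8", 1.0),    # switch to the new codec
    guard.accept("H.261", 1.1),  # late old-codec packet, dropped
    guard.accept("VP8", 1.2),    # new codec continues
    guard.accept("H.261", 2.0),  # a later, deliberate switch is allowed
]
print(results)
```

The guard drops the late H.261 packet that arrives just after the
switch to VP8, but still allows a deliberate change once the lockout
has expired.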