Re: [rtcweb] Comments on draft-uberti-rtcweb-plan-00

Paul Kyzivat <pkyzivat@alum.mit.edu> Mon, 13 May 2013 14:21 UTC

Message-ID: <5190F6E7.3090801@alum.mit.edu>
Date: Mon, 13 May 2013 10:21:27 -0400
From: Paul Kyzivat <pkyzivat@alum.mit.edu>
To: rtcweb@ietf.org
References: <BLU169-W1158BEB6CD5A0828D7D866293A60@phx.gbl>
In-Reply-To: <BLU169-W1158BEB6CD5A0828D7D866293A60@phx.gbl>

Comments inline, both to Bernard and to the draft itself.

On 5/11/13 7:59 PM, Bernard Aboba wrote:

>       2.1
>
>           These layouts can change dynamically, depending on the conference
>           content and the preferences of the receiver.  As such, there are not
>           well-defined 'roles', that could be used to group sources into
>           specific 'large' or 'thumbnail' categories.  As such, the requirement
>           Plan B attempts to satisfy is support for sending and receiving up to
>           hundreds of simultaneous, heterogeneous sources.
>
>       [BA] While I agree that the layouts can change dynamically, I am wondering if there is an implication that the burden of determining the 'roles' is on the mixer.  For example, it might be assumed that the mixer allocates an SSRC for the 'large' category, and other SSRCs for the 'thumbnails' and then these SSRCs are statically mapped to MSTs and rendered.  However, another way to handle it is for the browser to handle the role assignment, and I would argue that this could make more sense in some cases, particularly since this could make the mixer a lot simpler, or even obviate the need for a mixer entirely (e.g. an RTP translator might work in some cases).

ISTM that this particular part gets well into the CLUE design space, 
where the focus is on describing the spatial relationships among sources, 
and thus on the ability to choose which ones are desired on that basis. In 
CLUE we have concluded that these relationships are complex to describe, 
and not suited to embedding in SDP.

> 4.1  <http://tools.ietf.org/html/draft-uberti-rtcweb-plan-00#section-4.1>.  Negotiation of new or legacy behavior
>
>     In order to know whether a given application supports Plan B, an
>     attribute in the offer is needed.  There are various options that
>     could be used for this:
>
>     o  a=ssrc isn't enough, since you might not have any send streams,
>        and therefore no a=ssrc attributes.
>
>     o  a=max-*-ssrc could work, but has additional semantics
>
>     o  a=msid-semantic indicates that you understand MSIDs.
>
>     Because understanding MSID is a prerequisite to using plan B, the
>     third option (presence of a=msid-semantic) is recommended.
>
> [BA]  I would suggest that max-*-ssrc is a better choice because there are legacy scenarios where msid might not be present.

I agree with you here. Instead of layering baggage on an attribute 
intended for a different purpose, using this one means depending 
directly on the intended purpose of the attribute.
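
To make the difference concrete, here is a minimal sketch of the two detection rules being compared. It is only an illustration: the helper names are hypothetical, "a=max-*-ssrc" is assumed to expand to a=max-send-ssrc and a=max-recv-ssrc, and the attribute value shown in the sample offer is a placeholder that the code never interprets.

```python
# Hypothetical helpers for illustration; not taken from the draft or any library.

def sdp_attribute_names(sdp):
    """Return the set of a= attribute names found in an SDP blob."""
    names = set()
    for line in sdp.splitlines():
        line = line.strip()
        if line.startswith("a="):
            # The name ends at the first ':' (value attribute)
            # or at end of line (property attribute).
            names.add(line[2:].split(":", 1)[0])
    return names


def supports_plan_b(sdp, marker="max-ssrc"):
    """Apply one of the two candidate detection rules to an offer."""
    names = sdp_attribute_names(sdp)
    if marker == "max-ssrc":
        # Assumption: "max-*-ssrc" covers max-send-ssrc and max-recv-ssrc.
        return any(n.startswith("max-") and n.endswith("-ssrc") for n in names)
    if marker == "msid-semantic":
        return "msid-semantic" in names
    raise ValueError("unknown marker: " + marker)


# A receive-only offer from an endpoint that does not use MSID.
# The max-recv-ssrc value is illustrative only.
RECVONLY_OFFER = (
    "v=0\r\n"
    "m=video 9 RTP/SAVPF 100\r\n"
    "a=recvonly\r\n"
    "a=max-recv-ssrc:{*:4}\r\n"
)
```

With this sample, the max-*-ssrc rule recognizes the offer while the msid-semantic rule does not, which is the legacy case Bernard mentions.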

>       4.2
>       <http://tools.ietf.org/html/draft-uberti-rtcweb-plan-00#section-4.2>.
>       New signaling flow
>
>     When both sides support Plan B, to properly allow both sides to
>     indicate which MSTs they have, and allow the remote side to select
>     the desired MSTs to receive, a 3-way handshake is needed (this is
>     just math; the offer can't select the answerer's MSTs until they know
>     about them).
>
> [BA] While I understand the argument for why you need two O/As for both
> sides to select the desired streams, I think that it's possible to design
> the exchange so that only one O/A is needed most of the time.  The key
> concept is for the offer to contain information on what the offerer is capable
> of receiving in addition to what it is capable of sending.  Yes, another
> O/A might be needed if it turns out that the Offerer wants something different
> than what the Answerer chose, but at least the first Answer is guaranteed to
> be acceptable to the Offerer.
>
> That is, I believe we should think of the second O/A as an optional exchange
> that hopefully won't be needed much of the time, rather than as part of a
> "3-way" or "4-way" handshake that will execute every time.

IMO the 2-way handshake has always depended on an implicit commonality 
between caller and callee, and/or on applications simple enough that it is 
feasible to encompass all possible options within the first offer.

So it works well when you are calling from a "phone" and it is likely 
that the thing you are calling is also a phone. But it doesn't work very 
well if you are a telepresence system calling another telepresence system.

Even in more complex cases you can probably get things to work in one 
O/A exchange if a large percentage of the things you may call have 
similar configurations.

But the general case is going to require multiple O/A exchanges, or else 
exchange of info via some other channel prior to the O/A exchange.
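
As a rough way to compare the signaling cost of these alternatives, the sketch below counts round trips for a message sequence. It rests on two assumptions of mine: each one-way flight costs half a round trip, and consecutive messages from the same sender travel together as one flight. The sequence names are hypothetical.

```python
from itertools import groupby


def round_trips(messages):
    """Estimate signaling cost in RTTs for a list of (sender, kind) messages.

    Assumption: consecutive messages from the same sender form one flight,
    and each flight costs half a round trip.
    """
    flights = sum(1 for _ in groupby(messages, key=lambda m: m[0]))
    return flights / 2


# One ordinary offer/answer exchange.
SINGLE_OA = [("caller", "offer"), ("callee", "answer")]

# The 3-way flow: the callee's answer is followed immediately by its own offer.
THREE_WAY = [
    ("caller", "offer"),
    ("callee", "answer"),
    ("callee", "offer"),
    ("caller", "answer"),
]

# Two back-to-back exchanges, both initiated by the caller.
TWO_OA_SAME_OFFERER = [
    ("caller", "offer"),
    ("callee", "answer"),
    ("caller", "offer"),
    ("callee", "answer"),
]
```

Under these assumptions the three sequences cost 1, 1.5, and 2 round trips respectively, so the question is how often the extra half or full round trip is actually needed.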

>     The expected flow for this would be for the caller to
>     send an offer with its sources, then the callee would send back an
>     answer with the sources it wants the caller to send, followed
>     immediately by an offer with the sources that the callee has
>     available to send.  Finally, the answerer will reply back with the
>     sources that it wants to request from the callee.  The entire
>     sequence can be done in 1.5 RTT.
>
> [BA] Why not add the info on what sources the callee has available to send to the first Answer? If the Offer also contains the maximum number of received SSRCs,
> the Offerer should prepare to receive that many SSRCs, and the Answer could include up to that many sources
> as enabled and start sending.    That way, if the sources sent are OK with the Offerer then we don't need another
> Offer/Answer exchange, because the Answerer has indicated what sources it wants from the ones the Offerer
> said it could send.

Certainly the offer can describe a bunch of sendonly streams. That is 
the easy part. The hard part is describing what you want to receive, 
without knowing what is available to be received.

Also, it is hard to describe the content of those sendonly streams 
sufficiently in the offer so that the answerer will know whether it 
wants them or not.
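
The asymmetry can be sketched as a toy model. Everything here is hypothetical (function names, source labels, the idea that the answerer simply takes its first N sources); it only assumes, as in Bernard's proposal, that the offer carries a list of the offerer's sources plus a maximum receive count.

```python
def build_answer(offered, max_recv, answerer_wants, answerer_has):
    """Answerer's side of a single O/A exchange (hypothetical model)."""
    return {
        # Selection by identity is possible: the offer listed its sources.
        "selected": [s for s in offered if s in answerer_wants],
        # The offer carried only a count, so the answerer has to guess
        # which of its own sources the offerer would like to receive.
        "sending": answerer_has[:max_recv],
        "available": list(answerer_has),
    }


def needs_second_exchange(answer, offerer_prefers):
    """True if the answerer's guess differs from what the offerer would pick."""
    wanted = [s for s in answer["available"] if s in offerer_prefers]
    return set(wanted) != set(answer["sending"])


# Illustrative labels only: a simple endpoint calling a multi-camera room.
answer = build_answer(
    offered=["mic", "camera"],
    max_recv=2,
    answerer_wants={"camera"},
    answerer_has=["cam-left", "cam-center", "cam-right", "slides"],
)
```

Here the answerer can pick "camera" by name, but the offerer only learns about "slides" from the answer, so a preference for "cam-center" plus "slides" forces a second exchange. If the offerer happens to want the first two cameras, one exchange suffices.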

> This assumes that the Offerer can handle incoming RTP streams up to the maximum number of receive SSRCs before it receives the Answer which can explicitly declare the SSRCs.
>
>
>     In addition, since the
>     sources are known ahead of time by the recipient of said sources, it
>     is prepared to demux them by SSRC without any signaling/media race.

*How* are the sources known ahead of time by the recipient???

	Thanks,
	Paul
