Re: [rtcweb] Additional requirement - audio-only communication

Stefan Håkansson LK <stefan.lk.hakansson@ericsson.com> Fri, 26 August 2011 14:22 UTC


On 2011-08-26 16:11, Justin Uberti wrote:
>
>
> On Fri, Aug 26, 2011 at 3:14 AM, Stefan Håkansson LK
> <stefan.lk.hakansson@ericsson.com> wrote:
>
>     FWIW,
>
>     this is how I would do the case of the caller (initiator) wanting to
>     receive only audio, but willing to send audio and video with the
>     current API proposal:
>
>     1. Generate an audiovisual mediastream (getUserMedia)
>     2. create PeerConnection, add the above mediastream (addStream)
>     3. combine the SDP received from PeerConnection with the following
>     fields "This is an invitation to a communication", "sending audio
>     and video", "want to receive only audio" into a message sent to the
>     peer.
>     4. the app in the peer (same app!) receives the message, reads the
>     fields, (given user agreement to start the communication) creates a
>     PeerConnection, feeds the received SDP into the PC, generates an
>     *audio only* stream (getUserMedia) and adds it to the PC
>     5. A couple of SDP exchanges later the application is working as
>     intended. (there will be onaddstream's, you will attach those
>     streams to video/audio elements etc.)
>
>     Quite simple! And no need to have the application read, understand,
>     or modify, the SDPs - they are opaque. (Then, of course, the
>     mediastreams must be mapped to RTP sessions and such stuff, but I'm
>     sure that is solvable and should not be visible in the API IMHO.)
>
>
> Maybe I misunderstand, but it sounds like with these fields you are
> defining a new signaling protocol, which I think we want to avoid.

That's not how I see it. It is the same web app in both peers, and the
two instances can communicate with each other in any way the app
developer sees fit (using e.g. XHR via the web server). There is already
a need for a mechanism to pass the SDPs between the browsers, and that
mechanism could be used to pass other data.

So no new protocol needed!
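
A minimal sketch of what I mean, in TypeScript. The envelope and its
field names (type, sending, wantToReceive) are purely hypothetical and
app-defined; nothing here comes from the API proposal. The SDP is carried
as an opaque string and is never inspected by the app.

```typescript
// Hypothetical app-level message; every name here is an assumption made
// for illustration, not part of any specification.
type MediaKind = "audio" | "video";

interface CallEnvelope {
  type: "invitation";          // "This is an invitation to a communication"
  sending: MediaKind[];        // e.g. ["audio", "video"]
  wantToReceive: MediaKind[];  // e.g. ["audio"]
  sdp: string;                 // opaque blob from PeerConnection, passed through untouched
}

// Wrap the opaque SDP and the app's own fields into one string that can be
// sent over whatever channel the app already uses (e.g. XHR via the server).
function buildInvitation(
  sdp: string,
  sending: MediaKind[],
  wantToReceive: MediaKind[]
): string {
  const envelope: CallEnvelope = { type: "invitation", sending, wantToReceive, sdp };
  return JSON.stringify(envelope);
}

// Unwrap on the receiving side; only the app-level fields are checked.
function parseInvitation(raw: string): CallEnvelope {
  const parsed = JSON.parse(raw) as CallEnvelope;
  if (
    parsed.type !== "invitation" ||
    typeof parsed.sdp !== "string" ||
    !Array.isArray(parsed.sending) ||
    !Array.isArray(parsed.wantToReceive)
  ) {
    throw new Error("not a valid invitation");
  }
  return parsed;
}
```

The receiving app would feed envelope.sdp into its PeerConnection as-is
and use the other fields to decide what to do locally.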


>
> Going back to your step 3 - here the initiator can of course munge the
> generated SDP and discard the m=video section, which should work, but in
> other cases this munging may cause things to get out of sync since the
> initiator browser's understanding of the offer now differs from reality.
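
For reference, the kind of munging described in the quoted paragraph
could look roughly like the TypeScript sketch below. The function name
dropMediaSection is hypothetical, and the code only does naive line
filtering: it does not touch session-level attributes that might refer to
the removed section, and the browser is not told about the change, which
is exactly the out-of-sync risk mentioned above.

```typescript
// Remove every media section of the given kind ("video", "audio", ...)
// from an SDP blob. A media section starts at an "m=<kind> ..." line and
// runs until the next "m=" line or the end of the description.
function dropMediaSection(sdp: string, kind: string): string {
  const lines = sdp.split(/\r?\n/).filter((line) => line.length > 0);
  const kept: string[] = [];
  let skipping = false; // true while inside a section being removed

  for (const line of lines) {
    if (line.indexOf("m=") === 0) {
      // A new media section begins; decide whether to drop it.
      skipping = line.indexOf("m=" + kind + " ") === 0;
    }
    if (!skipping) {
      kept.push(line);
    }
  }
  // SDP lines are CRLF-terminated.
  return kept.join("\r\n") + "\r\n";
}
```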

I would really like to avoid the web app having to read and modify the
SDP, except in exceptional cases.

In the current API there is also no need (at least in this case), since
all streams are set up as unidirectional. All that is needed is that the
application in the callee's browser gets to know that it should add an
audio-only stream (and it can be told that as I outlined above).
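
As a rough TypeScript sketch of that callee-side decision: the app maps
what the caller said it wants to receive onto what it asks its own
browser to capture. The names captureRequestFor and CaptureRequest are
hypothetical. The resulting object has the shape that today's
navigator.mediaDevices.getUserMedia accepts; the getUserMedia in the
current proposal may take its options differently.

```typescript
type MediaKind = "audio" | "video";

// What the callee app will ask its own browser to capture.
interface CaptureRequest {
  audio: boolean;
  video: boolean;
}

// Capture only the kinds of media the caller asked to receive.
// No SDP is read or modified to make this decision.
function captureRequestFor(callerWantsToReceive: MediaKind[]): CaptureRequest {
  return {
    audio: callerWantsToReceive.indexOf("audio") !== -1,
    video: callerWantsToReceive.indexOf("video") !== -1,
  };
}
```

With a caller that wants to receive only audio, the callee would request
an audio-only stream and add it to its PeerConnection.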

> What we really want is either a) an API to tell the initiator browser
> to produce an offer with no m=video, or b) a way for the application to
> customize or generate the offer and feed it back to the browser. b) is
> definitely more complex, but a) could end up causing us to add a lot of
> knobs to the API.
>
>
>     Stefan
>
>
>
>
>
>
>     on 2011-08-26 06:48, Justin Uberti wrote:
>
>
>
>         On Thu, Aug 25, 2011 at 8:01 PM, Timothy B. Terriberry
>         <tterriberry@mozilla.com> wrote:
>
>             Justin Uberti wrote:
>
>                 I think it makes sense for the browser to emit capabilities,
>                 which could
>
>
>             I agree there's clearly some gaps here that need to be filled.
>
>
>                 then be used by the web app to generate a SDP offer or
>                 answer. This
>
>
>                 The original problem that started this email is one specific
>                 example - if the callee application wants to only receive
>                 audio, the application can generate an audio-only SDP based
>                 on the offer, the browser
>
>
>             I think Harald's original problem was the other way around:
>             the _caller_ wants to only receive audio, and needs to generate
>             an SDP _offer_ that says that, even if the browser is capable
>             of receiving video. I don't think that invalidates your point,
>             though.
>
>
>                 capabilities, and the desired app behavior - without any new
>                 APIs in the
>                 browser.
>
>
>             But I'm not sure what you mean by "without any new APIs"... in
>             your approach, something has to be able to enumerate the
>             capabilities in sufficient detail for the webapp to generate
>             SDP by itself. I don't think there are any existing APIs that
>             go that far.
>
>
>         I meant, without specific APIs for that specific use case (i.e.
>         "create an audio-only offer"). We would need some sort of
>         GetCapabilities API that returned a blob of all the session
>         description options the browser supported, which probably could be
>         formatted as an uber SDP offer if that made parsing simplest.
>
>
>             You also need an API to tell the browser what to actually do.
>             The current PeerConnection approach is passing in the offer or
>             the answer. If you're generating the answer, you need some way
>             to tell your browser what you answered. For the "please don't
>             send me video" case this is not an issue... it'll simply never
>             arrive. If you want to change what the local browser is sending
>             out, however, then it is.
>
>
>         Yes, you need a "HandleLocalDescription" and
>         "HandleRemoteDescription" API, instead of just a single
>         OnSignalingMessage API. The deviation from the current flow is
>         fairly minor, you just have 2 additional states in the state
>         machine.
>
>
>             I do agree it eliminates the need for an API to tell the browser
>             what kind of SDP to generate, but it also seems like it imposes
>             a pretty big burden on application developers: even if you keep
>             the currently-proposed PeerConnection ability to generate SDP as
>             the "simple API", the moment you want to do something slightly
>             more complex, you have to add code to generate the appropriate
>             SDP, which necessarily involves figuring out all the
>             capabilities of the various browsers on various platforms.
>             Maybe I'm naive in thinking that seems like an awful lot of
>             work just to say, "Please don't send me video."
>
>
>         It's a fair point, there is more complexity involved in generating
>         the offer, but in our experience it is manageable. I suppose it
>         depends on how many APIs we decide we need to control the
>         generation of SDP (i.e. use cases other than no-video). If we
>         decide we need to control crypto, resolution preference, codec
>         preference, we may find this approach simpler.
>
>
>             _______________________________________________
>             rtcweb mailing list
>             rtcweb@ietf.org
>             https://www.ietf.org/mailman/listinfo/rtcweb
>
>
>
>
>
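
PS. The split into "HandleLocalDescription" and "HandleRemoteDescription"
mentioned in the quoted text can be pictured as a small state machine. The
TypeScript sketch below is only an illustration under my own assumptions:
the state names, the class name and the transition rules are hypothetical
and not taken from the thread or from any proposal. It assumes the two
additional states are "have a local offer" and "have a remote offer".

```typescript
// Hypothetical states; "have-local-offer" and "have-remote-offer" stand in
// for the two additional states mentioned in the quoted text.
type SignalingState = "idle" | "have-local-offer" | "have-remote-offer" | "stable";
type DescriptionType = "offer" | "answer";

class DescriptionStateMachine {
  state: SignalingState = "idle";

  // The app tells the browser which description it sent to the peer.
  handleLocalDescription(type: DescriptionType): void {
    if (type === "offer" && (this.state === "idle" || this.state === "stable")) {
      this.state = "have-local-offer";
    } else if (type === "answer" && this.state === "have-remote-offer") {
      this.state = "stable";
    } else {
      throw new Error("local " + type + " not allowed in state " + this.state);
    }
  }

  // The app passes in the description it received from the peer.
  handleRemoteDescription(type: DescriptionType): void {
    if (type === "offer" && (this.state === "idle" || this.state === "stable")) {
      this.state = "have-remote-offer";
    } else if (type === "answer" && this.state === "have-local-offer") {
      this.state = "stable";
    } else {
      throw new Error("remote " + type + " not allowed in state " + this.state);
    }
  }
}
```

The caller goes idle, have-local-offer, stable; the callee goes idle,
have-remote-offer, stable. An answer with no pending offer is rejected.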