Re: [rtcweb] My Opinion: Why I think a negotiating protocol is a Good Thing

Hadriel Kaplan <> Thu, 20 October 2011 03:13 UTC

From: Hadriel Kaplan <>
To: Cullen Jennings <>

[breaking your email up into chunks, 'cause my response is getting long]

On Oct 19, 2011, at 8:46 PM, Cullen Jennings wrote:

> I agree SDP is not used as an RTP API. I looked at a few RTP APIs and they looked like "here is your compressed media and its timestamp" and "send this compressed media with the following timestamp to this IP, port". I don't think anyone wants that to be the level of API that gets exposed by the browser. For one thing, the performance would likely be unworkable.

Yeah, when I said it I was kinda thinking "well, not really an 'RTP' library, because many of those are basically pseudo-sockets"; what I had in mind was more like a media library (libraries that tie the codec, RTP, SRTP, and ICE layers together and provide an API to an app above).

> One question that comes up is: when a browser adds support for a new codec, should the JavaScript code of a given web site need to change to take advantage of it? If your answer to that is, at least in some cases, no, then it means you need an API that allows the browser to do the negotiation of the codecs.

I don't think you do.  At least for most "codecs", all a Browser needs to give/be-given is the a=rtpmap and a=fmtp lines, or maybe even only the portions after the PT numbers, as a list/table of strings and their payload-type numbers, per audio/video stream.  I know there are some other codec-specific attributes now and then (T.38 comes to mind, but it was arguably not your typical "codec" anyway)... but in general wouldn't rtpmap+fmtp contents do it?

There are a lot of other attributes, of course, but not per new codec.  Right?
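
To make the "table of strings and payload-type numbers" idea concrete, here is a minimal sketch in Python.  The function name codec_table and the sample audio lines are my own illustrations, not anything from the W3C draft or ROAP; it just assumes the app is handed the attribute lines for one stream and pulls out rtpmap/fmtp per payload type.

```python
import re
from typing import Dict, List

def codec_table(media_lines: List[str]) -> Dict[int, Dict[str, str]]:
    """Map payload type -> {'rtpmap': ..., 'fmtp': ...} for one media stream.

    Hypothetical helper: only a=rtpmap and a=fmtp lines are considered;
    every other line is ignored.
    """
    table: Dict[int, Dict[str, str]] = {}
    for line in media_lines:
        m = re.match(r"a=(rtpmap|fmtp):(\d+) (.+)$", line.strip())
        if m:
            attr, pt, rest = m.group(1), int(m.group(2)), m.group(3)
            # Keep only the portion after the PT number, keyed by the PT.
            table.setdefault(pt, {})[attr] = rest
    return table

# Illustrative audio stream (assumed values, not from any real session).
audio = [
    "m=audio 49170 RTP/AVP 0 101",
    "a=rtpmap:0 PCMU/8000",
    "a=rtpmap:101 telephone-event/8000",
    "a=fmtp:101 0-15",
    "a=sendrecv",
]

if __name__ == "__main__":
    for pt, attrs in sorted(codec_table(audio).items()):
        print(pt, attrs)
```

The point of the sketch is that the JavaScript never has to understand the strings: a new codec is just one more row in the table.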

> Today SDP (and the mapping of it in Jingle) is pretty much the only game in town for that sort of negotiation. You can argue whether we should use SDP or invent something new. I'd love to hear your opinion on that. If we decide we are going to use SDP, I think you end up with an API at roughly the level of what is in the W3C WEBRTC draft today and something around the level of protocol described in ROAP.

Let's assume we have to use SDP - I don't actually think we do, but just for argument's sake.  That still doesn't mean we have to embed the offer/answer model in the Browser, or even use it anywhere at all for pure web-apps.  The offer/answer model has some very specific semantics, which are actually restrictive.  For example, the offer/answer model assumes a symmetric media-line model: if you send one audio and one video m-line, the answer can only have one audio and one video m-line.

As an example of something that cannot be accomplished because of this: imagine a Web-application which allows the Browser to communicate with a TelePresence (TP) system.  TP systems have multiple cameras, screen displays, microphones, and speakers.  A PC-based Browser typically only has a single microphone and camera, but can display multiple video feeds separately and can render-mix the incoming audio streams.  Thus, a Browser-to-TP session would produce an asymmetric media stream model: multiple video streams from the TP system to the Browser, and one video stream from the Browser to the TP system, and the same for audio.  Each TP audio and video stream is an independent m-line RTP session and has unique attributes to indicate position (left/center/right), which a Browser could in theory use to display the streams on the left/center/right of its window.  Doing that is currently not possible with SDP offer/answer; not only because the SDP attributes aren't yet defined, but because the offer/answer model assumes a symmetric number of media-lines (m= lines).  Clearly if and when SDP is changed to handle TelePresence cases, Browsers could be upgraded to handle it as well sometime afterward; but they wouldn't need to be if the Browser hadn't been involved in SDP and offer/answer to begin with.
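
The symmetry restriction can be sketched as a simple check, again in Python.  This is a simplified illustration under my own assumptions: the helper names media_types and answer_matches_offer are hypothetical, the SDP fragments contain only m= lines (so they are not complete session descriptions), and the check covers only the "same m-lines, in the same order" rule, not the rest of offer/answer.

```python
from typing import List

def media_types(sdp: str) -> List[str]:
    """Return the media type of each m= line, in order (e.g. ['audio', 'video'])."""
    return [line.split()[0][2:] for line in sdp.splitlines() if line.startswith("m=")]

def answer_matches_offer(offer: str, answer: str) -> bool:
    """Simplified symmetry check: same number of m= lines, same types, same order."""
    return media_types(offer) == media_types(answer)

# Browser side: one microphone, one camera (illustrative ports and payload types).
browser_offer = "\n".join([
    "m=audio 49170 RTP/AVP 0",
    "m=video 51372 RTP/AVP 31",
])

# A symmetric answer, which offer/answer permits.
symmetric_answer = "\n".join([
    "m=audio 40000 RTP/AVP 0",
    "m=video 40002 RTP/AVP 31",
])

# What a TP system would like to send back: left/center/right video feeds.
tp_answer = "\n".join([
    "m=audio 40000 RTP/AVP 0",
    "m=video 40002 RTP/AVP 31",
    "m=video 40004 RTP/AVP 31",
    "m=video 40006 RTP/AVP 31",
])

if __name__ == "__main__":
    print(answer_matches_offer(browser_offer, symmetric_answer))
    print(answer_matches_offer(browser_offer, tp_answer))
```

The TP-style answer fails the check purely because of its m-line count, before any question of position attributes even comes up.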

The offer/answer model is an ok protocol between VoIP systems, but it's not a good *API* between an application and its library.