Re: [rtcweb] My Opinion: Why I think a negotiating protocol is a Good Thing

Cullen Jennings <> Fri, 21 October 2011 04:35 UTC

On Oct 19, 2011, at 9:13 PM, Hadriel Kaplan wrote:

> [breaking your email up into chunks, 'cause my response is getting long]
> On Oct 19, 2011, at 8:46 PM, Cullen Jennings wrote:
>> I agree SDP is not used as an RTP API. I looked at a few RTP APIs and they looked like "here is your compressed media and its timestamp" and "send this compressed media with the following timestamp to this IP and port". I don't think anyone wants that to be the level of API that gets exposed by the browser. For one thing, the performance would likely not be workable.
> Yeah when I said it I was kinda thinking "well not really 'RTP' library, because many of those are basically pseudo-sockets", but I was thinking like a media library (libraries that tie the codecs and RTP and SRTP and ICE layers together and provide an API to an app above).
>> One question that comes up is: when a browser adds support for a new codec, should the JavaScript code of a given web site need to change to take advantage of it? If your answer to that is, at least in some cases, no, then it means you need an API that allows the browser to do the negotiation of the codecs.
> I don't think you do.  At least for most "codecs", all a Browser needs to give/be-given is all the a=rtpmap and a=fmtp lines, or maybe even only the portions after the PT numbers, as a list/table of strings and their payload-type numbers, per audio/video stream.  I know there are some other codec-specific attributes now and then (T.38 comes to mind, but it was arguably not your typical "codec" anyway)... but in general wouldn't rtpmap+fmtp contents do it?
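
A minimal sketch of the "list/table of strings and their payload-type numbers" idea, assuming the browser hands the application the raw a=rtpmap and a=fmtp lines for one media stream. The function name codec_table and the sample media section are illustrative assumptions, not part of any proposed API.

```python
import re

def codec_table(media_section):
    """Map each payload-type number to its rtpmap and fmtp strings.

    The strings are kept opaque: nothing here interprets codec parameters.
    """
    table = {}
    for raw in media_section.splitlines():
        line = raw.strip()
        match = re.match(r"a=(rtpmap|fmtp):(\d+) (.+)$", line)
        if match:
            attribute, payload_type, value = match.groups()
            table.setdefault(int(payload_type), {})[attribute] = value
    return table

# Hypothetical audio section, for illustration only.
example = """m=audio 49170 RTP/AVP 0 101
a=rtpmap:0 PCMU/8000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-15
"""

table = codec_table(example)
for payload_type in sorted(table):
    print(payload_type, table[payload_type])
```

The table carries the strings through untouched, which is the point of the proposal: the application can forward them without knowing what they mean.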

I'm happy to ignore T.38 as a "special case", but new codecs, video codecs in particular, need to define new parameters. Let's say a given codec defines some new parameters that show up in some normal spot in SDP. I don't see how the JS app will be able to negotiate without understanding what the new parameters mean. Let's say some new VP9 video codec defines a new maxFluffyDepth and the browser reports that it supports 666. The other side offers maxFluffyDepth=Yes. What does the JavaScript code select? It might be that, based on these two values, you even use a different parameter, say fluffyDepth=-6ft, to specify the interoperable solution.
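
To make the problem concrete, here is a rough sketch of what a generic application-level negotiator could do with parameters it does not understand. It assumes semicolon-separated key=value fmtp parameters; parse_fmtp and naive_negotiate are hypothetical helper names, and maxFluffyDepth is the made-up parameter from the paragraph above.

```python
def parse_fmtp(params):
    """Split 'key=value;key=value' into a dict of opaque strings."""
    parsed = {}
    for item in params.split(";"):
        item = item.strip()
        if not item:
            continue
        key, _, value = item.partition("=")
        parsed[key.strip()] = value.strip()
    return parsed

def naive_negotiate(local, remote):
    """Agree only where both sides sent the identical string.

    Without codec-specific knowledge there is no rule for combining two
    different values, so those keys are reported as unresolved.
    """
    agreed = {}
    unresolved = []
    for key in sorted(local.keys() & remote.keys()):
        if local[key] == remote[key]:
            agreed[key] = local[key]
        else:
            unresolved.append(key)
    return agreed, unresolved

local = parse_fmtp("maxFluffyDepth=666")
remote = parse_fmtp("maxFluffyDepth=Yes")
agreed, unresolved = naive_negotiate(local, remote)
print(agreed, unresolved)
```

String equality is the only comparison available to code that treats parameters as opaque, so the mismatch above is left unresolved, and nothing in this sketch could derive a third parameter such as fluffyDepth from the two inputs.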

> There are a lot of other attributes, of course, but not per new codec.  Right?
>> Today SDP (and the mapping of it in Jingle) is pretty much the only game in town for that sort of negotiation. You can argue about whether we should use SDP or invent something new. I'd love to hear your opinion on that. If we decide we are going to use SDP, I think you end up with an API at roughly the level of what is in the W3C WEBRTC draft today and something around the level of protocol described in ROAP.
> Let's assume we have to use SDP - I don't actually think we do, but just for argument's sake.  That still doesn't mean we have to embed the offer/answer model in the Browser, or even use it anywhere at all for pure web-apps.  The offer/answer model has some very specific semantics, which are actually restrictive.  For example, the offer/answer model assumes a symmetric media-line model: if you send one audio and one video m-line, the answer can only have one audio and one video m-line.
> As an example of something that cannot be accomplished because of this: imagine a Web-application which allows the Browser to communicate with a TelePresence (TP) system.  TP systems have multiple cameras, screen displays, microphones, and speakers.  A PC-based Browser typically only has a single microphone and camera, but can display multiple video feeds separately and can render-mix the incoming audio streams.  Thus, a Browser to TP system would produce an asymmetric media stream model: multiple video streams from the TP system to the Browser, and one video stream from the Browser to the TP system, and the same for audio.  Each TP audio and video stream is an independent m-line RTP session and has unique attributes to indicate position (left/center/right), which a Browser could in theory display on the left/center/right of your browser window.  Doing that is currently not possible with SDP offer/answer; not only because the SDP attributes aren't yet defined, but because the offer/answer model assumes a symmetric number of media-lines (m= lines).  Clearly if and when SDP is changed to handle TelePresence cases, Browsers could be subsequently upgraded to handle it as well sometime after; but they wouldn't need to if the Browser hadn't been involved in SDP and offer/answer to begin with.
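
The symmetry constraint described above can be sketched as a simple check. RFC 3264 requires the answer to contain exactly as many m= lines as the offer, with matching media types in the same order. The function names and the two SDP fragments below are illustrative assumptions; the fragments omit everything except the m= lines.

```python
def media_types(sdp):
    """Return the media type of each m= line, in order of appearance."""
    types = []
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("m="):
            # "m=video 5004 RTP/AVP 96" -> "video"
            types.append(line[2:].split()[0])
    return types

def answer_is_symmetric(offer, answer):
    """True if the answer has the same m= lines (count, type, order) as the offer."""
    return media_types(offer) == media_types(answer)

# Browser side: one microphone, one camera.
browser_offer = """m=audio 5000 RTP/AVP 0
m=video 5002 RTP/AVP 96
"""

# Hypothetical TelePresence side wanting three video streams back.
tp_answer = """m=audio 6000 RTP/AVP 0
m=video 6002 RTP/AVP 96
m=video 6004 RTP/AVP 96
m=video 6006 RTP/AVP 96
"""

print(answer_is_symmetric(browser_offer, browser_offer))
print(answer_is_symmetric(browser_offer, tp_answer))
```

Under this check the asymmetric TelePresence answer is rejected, which is the restriction being objected to.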

Hmm - this example sort of leads in the direction I think about this. I imagine we both agree that eventually mmusic will define an SDP attribute to help position the video feeds. CLUE will slow this down, but once CLUE figures out it needs to specify some geometry, it will end up taking that to mmusic. Having a different solution defined in something the browsers are using (or in a protocol like TIP) will just complicate making it work in SDP in a way that continues to be possible to gateway to what SDP does. It would be easier to just add it in one place and call it done.

> The offer/answer model is an ok protocol between VoIP systems, but it's not a good *API* between an application and its library. 
> -hadriel