[rtcweb] SDP offer/answer vs. JSON (was: About defining a signaling protocol for WebRTC (or not))

Hadriel Kaplan <HKaplan@acmepacket.com> Fri, 16 September 2011 18:26 UTC


On Sep 15, 2011, at 9:09 AM, Harald Alvestrand wrote:

> The disadvantage of parsing to another structure (I am fond of JSON myself) is that one has to maintain a data definition for the format being parsed to, a defined transform between that and the "canonical SDP structure" has to be implemented in user space when one does SDP interoperability, both of those have to be updated for every SDP extension that someone defines somewhere, and one is still not free to define extensions on the non-SDP side if one still requires the ability to map them into SDP.
> If one uses the "native" SDP format, which is the format in which every extension to the format gets documented, the browsers are the ones who *have* to parse it (although others are likely to).

Right, so the above paragraphs get to the heart of the matter, methinks.  Ultimately we need W3C to define an API, and the API has to provide a means of learning RTP/media info from the browser and commanding the browser to perform certain things with RTP/media.  One could expose this API as a true data structure, or as a long string of tokens to be parsed/serialized back/forth.  If the latter, then the choices are basically JSON or SDP.  And SDP seems advantageous because it appears to be the least work for the simple use-cases: the javascript could just copy the SDP back/forth between the browser and server.  In other words you're optimizing for the very simple use-cases, in exchange for making it more complicated for the advanced use-cases.  Right?

OK, that's a laudable goal.  And I recognize that the decision has basically already been made, and nothing's going to change it. 

But email's free... so for the sake of posterity (and email archiving) here're some reasons not to use SDP anyway:
1) Incorporating SDP and the offer/answer model into the Browser and W3C API inexorably ties the W3C to the IETF MMUSIC working group for all time.  So far, I had been going on the assumption the IETF would be defining what the RTP library had to do/expose, while W3C would define the API.  But if the API includes SDP offer/answer, that portion is the IETF's domain too, afaik.  Anything the W3C wants to do in the future for that has to go through the IETF, not just IANA. (right?)

2) This isn't just about JSON vs. SDP - it's about SDP *offer/answer*.  SDP offer/answer wasn't meant to be an API between an application and its RTP library - it's a *protocol* between applications.  One side-effect of this is it has historic state.  For example, if an SDP offer contains two media lines, and one media is removed, the number of SDP media lines doesn't reduce back to one - EVER.  So if PeerConnection.removeStream() is invoked, the Browser needs to remember there was that stream and signal it in SDP as disabled for all time, until PeerConnection is closed.  If addStream() is invoked later, it could/could-not re-use that same (disabled) media line, or add a new one.
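
To make the "historic state" point concrete, here is a minimal sketch in Python of the bookkeeping whoever owns offer/answer has to do. The class and method names (MediaLineTable, add_stream, remove_stream), the reuse_disabled_slots policy flag, and the payload type numbers are assumptions made up for illustration; they are not part of any W3C or IETF API. The only rule taken from offer/answer itself is that a removed stream stays in the SDP as a media line with port 0, and that a disabled slot may later be reused.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Placeholder payload types for the sketch: 0 is PCMU, 96 is a dynamic type.
PAYLOAD_TYPES = {"audio": 0, "video": 96}


@dataclass
class MediaLine:
    kind: str                        # "audio" or "video"
    port: int                        # port 0 marks the media line as disabled
    stream_id: Optional[str] = None  # hypothetical local handle, not sent in SDP


@dataclass
class MediaLineTable:
    lines: List[MediaLine] = field(default_factory=list)
    reuse_disabled_slots: bool = True  # policy choice: reuse a slot or append

    def add_stream(self, stream_id: str, kind: str, port: int) -> int:
        """Add a stream and return the index of the media line it occupies."""
        if self.reuse_disabled_slots:
            for index, line in enumerate(self.lines):
                if line.port == 0:
                    self.lines[index] = MediaLine(kind, port, stream_id)
                    return index
        self.lines.append(MediaLine(kind, port, stream_id))
        return len(self.lines) - 1

    def remove_stream(self, stream_id: str) -> None:
        """Disable the media line; it is never deleted from the table."""
        for line in self.lines:
            if line.stream_id == stream_id:
                line.port = 0
                line.stream_id = None
                return
        raise KeyError(stream_id)

    def to_m_lines(self) -> List[str]:
        return [
            f"m={line.kind} {line.port} RTP/AVP {PAYLOAD_TYPES[line.kind]}"
            for line in self.lines
        ]
```

With an audio and a video stream added and the video one then removed, to_m_lines() still returns two entries, the second with port 0. Whether a later add_stream() lands in that slot or in a new third line depends on the policy flag, which is exactly the could/could-not ambiguity above.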

As another example, if a new SDP offer is sent out in SIP and gets rejected with a 488, the session reverts to the previously agreed SDP state.  The Browser would therefore have to keep state of previous SDP and revert to it to handle this case.  For example, if my Javascript started with only an audio MediaStream in PeerConnection and later added a video MediaStream to it, the new SDP offer would contain two media lines - if the offer gets rejected with 488, how is that communicated to the Browser and what will the browser do?
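
A rough sketch of the revert-on-rejection state, again in Python and again with invented names (OfferAnswerSession, send_offer, on_accepted, on_rejected). It simplifies heavily: the agreed state is modelled as just the list of media lines from the last accepted offer, whereas a real implementation would have to track the answer as well.

```python
from typing import List, Optional


class OfferAnswerSession:
    """Tracks the last agreed media lines and at most one outstanding offer."""

    def __init__(self) -> None:
        self.agreed: List[str] = []               # last state both sides accepted
        self.pending: Optional[List[str]] = None  # offer awaiting a response

    def send_offer(self, media_lines: List[str]) -> List[str]:
        if self.pending is not None:
            raise RuntimeError("an offer is already outstanding")
        self.pending = list(media_lines)
        return self.pending

    def on_accepted(self) -> None:
        """The peer answered the offer: the pending state becomes agreed."""
        if self.pending is None:
            raise RuntimeError("no offer is outstanding")
        self.agreed, self.pending = self.pending, None

    def on_rejected(self, status_code: int) -> List[str]:
        """The offer was rejected (e.g. SIP 488): drop it and keep the old state."""
        if self.pending is None:
            raise RuntimeError("no offer is outstanding")
        self.pending = None
        return list(self.agreed)
```

In the audio-then-video case, the session would hold one audio line as agreed, send an offer with two lines, and on_rejected(488) would hand back the single audio line. Something still has to tell the browser to stop treating the video MediaStream as part of the session, which is the open question.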

3) You might well want information conveyed across that "API" that is not meant to be sent on the wire in SDP - things you don't want defined by IANA as SDP tokens.  For example, you may want to provide packet counts, jitter, latency, and other meta-information about individual RTP codecs.  Using JSON allows you to have data member variables which will not get serialized into SDP, but are purely for the javascript's use, while still within the referential tree structure of the media stream info.  Or they may be for sending to peers, but simply not for SDP. (like you could send the jitter/latency info through the signaling channel)
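
As an illustration of that split, here is a small Python sketch of a media description held as a JSON-style tree, where some members are serialized to SDP and others stay local. Every field name here (payload_type, clock_rate, stats, jitter_ms and so on) is hypothetical, as are the to_sdp and stats_report helpers; nothing like this has been defined by W3C or the IETF.

```python
import json
from typing import Any, Dict, List

# A hypothetical media description. The "stats" member hangs off each codec
# in the same tree but is never meant to appear on the wire as SDP.
media: Dict[str, Any] = {
    "kind": "audio",
    "port": 49170,
    "codecs": [
        {
            "payload_type": 0,
            "name": "PCMU",
            "clock_rate": 8000,
            "stats": {"packets_received": 1200, "jitter_ms": 4.5, "latency_ms": 80},
        },
    ],
}


def to_sdp(description: Dict[str, Any]) -> List[str]:
    """Serialize only the wire-relevant members into SDP lines."""
    payload_types = " ".join(str(c["payload_type"]) for c in description["codecs"])
    lines = [f"m={description['kind']} {description['port']} RTP/AVP {payload_types}"]
    for codec in description["codecs"]:
        lines.append(
            f"a=rtpmap:{codec['payload_type']} {codec['name']}/{codec['clock_rate']}"
        )
    return lines


def stats_report(description: Dict[str, Any]) -> str:
    """Collect the local-only members, e.g. to send over the signaling channel."""
    report = {codec["name"]: codec["stats"] for codec in description["codecs"]}
    return json.dumps(report, sort_keys=True)
```

Here to_sdp(media) yields an m= line and an a=rtpmap line with no trace of the stats, while stats_report(media) produces a JSON string the javascript could use itself or pass to a peer without any SDP token having to be registered for it.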

4) Obviously if the application as a whole needs to do SDP offer/answer, then *someone* will have to implement it correctly, including the state-related stuff.  It could be the browser or the javascript that does this.  Chrome may do a perfect job of that in the browser, for all I know.  But there are other browser vendors, including niche ones such as Dolphin and Skyfire.  What are the odds they all get it right the first time?

So which would you rather have responsible for updating an SDP engine (if one is even needed), or for keeping up with "every SDP extension that someone defines somewhere": the javascript, which is written by the developer who knows what they want when they want it, and who can update their code by updating their javascript (or not, if they don't need to); or the browsers, which are written by companies not under the javascript developer's control, and updated at a time of the browser companies' choosing?

Obviously for some things the browser will have to be updated regardless, for example to understand rather than just ignore new JSON entries, to provide new codecs, etc.  But not all new SDP attributes require changes in the media plane, nor encoding into JSON.  In fact, a lot doesn't - some of it's higher-application info, not really for the RTP library, and more of it's coming in the future.