[rtcweb] About defining a signaling protocol for WebRTC (or not)

Iñaki Baz Castillo <ibc@aliax.net> Wed, 14 September 2011 14:54 UTC

Hi all,

There are some threads about the need (or not) for a well-defined
signaling protocol within WebRTC. I would like to comment on it.

WebRTC defines multimedia capabilities for web-browsers and mandates
protocols such as RTP, STUN and ICE, plus an understanding of SDP
(RFC 4566). The aim of these protocols is to enable multimedia streams
between a web-browser and another endpoint (which could also be a
web-browser).

But having the above is not enough, since a signaling
protocol/mechanism for managing the media sessions is also required
(for requesting a multimedia session from the endpoint, for
terminating it, for putting it on hold...).

Both SIP and XMPP (with Jingle) act as signaling protocols and
manage multimedia sessions based on SDP descriptions (SIP uses the
plain SDP grammar as defined in RFC 4566 while XMPP uses an XML version
of the SDP format). So both SIP and XMPP could be a good choice, but so
could any custom signaling protocol carrying SDP-like information.

If WebRTC mandates a specific signaling protocol then all web
providers would have to incorporate that protocol within their
infrastructure, which does not seem feasible to me (think of web pages
served from hosting datacenters that just provide an Apache server to
the web developer, for example).

So I wonder: why is a specific signaling protocol needed at all? AFAIK
the only thing we need is an API (within WebRTC) to manage multimedia
sessions (start one, terminate it, use codec XXXX, put it on hold...).
How the client application (let's assume the JavaScript code) obtains
the session information should be out of the scope of WebRTC. The
client application (JavaScript) just needs to retrieve (via HTTP,
WebSocket or whatever) the "SDP" information provided by the endpoint
and use that data to make API calls to the WebRTC stack, passing as
arguments the remote peer IP, port, type of session, codec to use, and
so on.
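
As a rough illustration of the paragraph above, here is a minimal
TypeScript sketch of how page code might pull the remote address, port,
session type and codec names out of an SDP description before calling a
media API. The names MediaParams and parseSdp are hypothetical (they do
not belong to any WebRTC API), and the parsing is deliberately
simplified: only the c=, m= and a=rtpmap lines of RFC 4566 are read.

```typescript
// Hypothetical shape of the arguments a page could pass to a media API.
interface MediaParams {
  kind: string;      // "audio", "video", ...
  address: string;   // connection address from the c= line
  port: number;      // transport port from the m= line
  codecs: string[];  // encoding names from a=rtpmap lines
}

// Simplified reader for an RFC 4566 session description.
// A media-level c= line overrides the session-level one; every
// attribute other than rtpmap is ignored.
function parseSdp(sdp: string): MediaParams[] {
  const media: MediaParams[] = [];
  let sessionAddress = "";
  let current: MediaParams | undefined;

  for (const line of sdp.split(/\r?\n/)) {
    const value = line.slice(2);
    if (line.startsWith("c=")) {
      // c=<nettype> <addrtype> <connection-address>
      const address = value.split(" ")[2] ?? "";
      if (current) current.address = address;
      else sessionAddress = address;
    } else if (line.startsWith("m=")) {
      // m=<media> <port> <proto> <fmt> ...
      const fields = value.split(" ");
      current = {
        kind: fields[0] ?? "",
        address: sessionAddress,
        port: parseInt(fields[1] ?? "0", 10),
        codecs: [],
      };
      media.push(current);
    } else if (line.startsWith("a=rtpmap:") && current) {
      // a=rtpmap:<payload type> <encoding name>/<clock rate>
      const encoding = value.split(" ")[1] ?? "";
      current.codecs.push(encoding.split("/")[0] ?? "");
    }
  }
  return media;
}
```

How the SDP text reaches the page (HTTP, WebSocket, anything else) is
left open here on purpose, which is the point being argued.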

For example, if a web page makes use of SIP over WebSocket or XMPP
over WebSocket, the signaling (also containing SDP information) would
be carried within SIP or XMPP messages. The only requirement would be
for the WebSocket server to be integrated within a SIP proxy/server
implementing draft-ibc-rtcweb-sip-websocket or an XMPP server
implementing draft-moffitt-xmpp-over-websocket. The client application
(JavaScript in the web page) would parse the SDP bodies and make
WebRTC calls when appropriate to initiate or answer multimedia
sessions. We would then get full interoperability with the SIP/XMPP
world out there (without requiring a server/gateway performing
conversion of application level protocols).
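
For the SIP case, the step of getting at the SDP body can be sketched
as follows. This is only an assumption about what the JavaScript side
might do with a SIP message received as text over a WebSocket;
extractSdpBody is a hypothetical helper, and it ignores header folding,
compact header names and multipart bodies.

```typescript
// Split a SIP message into headers and body, and return the body only
// when the Content-Type header declares it as application/sdp.
function extractSdpBody(sipMessage: string): string | null {
  // Headers and body are separated by an empty line (CRLF CRLF).
  const separator = sipMessage.indexOf("\r\n\r\n");
  if (separator < 0) return null;

  // Drop the start line (request line or status line).
  const headerLines = sipMessage.slice(0, separator).split("\r\n").slice(1);
  const body = sipMessage.slice(separator + 4);

  for (const line of headerLines) {
    const colon = line.indexOf(":");
    if (colon < 0) continue;
    // Header names are case-insensitive, so compare in lower case.
    const name = line.slice(0, colon).trim().toLowerCase();
    const value = line.slice(colon + 1).trim().toLowerCase();
    if (name === "content-type" && value.startsWith("application/sdp")) {
      return body.length > 0 ? body : null;
    }
  }
  return null;
}
```

The returned text could then be handed to whatever SDP parsing the page
uses before it calls into the browser's media stack.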

In the same way, another web page that only requires multimedia
sessions between web-browsers could prefer to implement a simple,
custom JSON format as a signaling mechanism on top of WebSocket (or
use HTTP Comet, long polling, etc). It could map the SDP definition
into a JSON struct. Again the JavaScript code parses the SDP
information and calls WebRTC API functions to manage multimedia
sessions. The only requirement would be for the HTTP server to
implement WebSocket or HTTP Comet (or nothing extra if HTTP long
polling is used).
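
One possible shape for such a custom JSON format is sketched below in
TypeScript. Everything in it is an assumption made for illustration:
the message types, the field names and the encodeSignal/decodeSignal
helpers are hypothetical, and a real page would be free to choose a
completely different layout.

```typescript
// Hypothetical message kinds for a page-defined signaling format.
type SignalType = "offer" | "answer" | "hold" | "bye";

// SDP-like media information mapped into plain JSON fields.
interface MediaDescription {
  kind: string;
  address: string;
  port: number;
  codecs: string[];
}

interface SignalMessage {
  type: SignalType;
  sessionId: string;
  media: MediaDescription[];
}

const SIGNAL_TYPES: readonly string[] = ["offer", "answer", "hold", "bye"];

// Serialize a message for sending over WebSocket, Comet or long polling.
function encodeSignal(message: SignalMessage): string {
  return JSON.stringify(message);
}

// Parse and minimally validate a received message.
// The entries of the media array are not checked individually here.
function decodeSignal(text: string): SignalMessage {
  const data: unknown = JSON.parse(text);
  if (typeof data !== "object" || data === null) {
    throw new Error("signal must be a JSON object");
  }
  const record = data as Record<string, unknown>;
  const type = record["type"];
  const sessionId = record["sessionId"];
  const media = record["media"];
  if (typeof type !== "string" || SIGNAL_TYPES.indexOf(type) < 0) {
    throw new Error("unknown signal type");
  }
  if (typeof sessionId !== "string") {
    throw new Error("sessionId must be a string");
  }
  if (!Array.isArray(media)) {
    throw new Error("media must be an array");
  }
  return {
    type: type as SignalType,
    sessionId,
    media: media as MediaDescription[],
  };
}
```

The server in this scenario only relays opaque text between the two
pages; it does not need to understand the format.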

So my proposal is that WebRTC should not mandate a signaling protocol
in the web-browser, but just define a requirement for managing
multimedia sessions from the JavaScript code through a well-defined
API. IMHO this is the approach that fits best with the flexibility of
the web, and it lets each web provider decide which technology to use
as signaling protocol, rather than forcing them to implement
SIP/XMPP/other-protocol on the server side.
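
To show what "no mandated signaling protocol" could look like from the
page's point of view, here is a small TypeScript sketch in which the
session logic depends only on an abstract transport. All names
(SignalingTransport, createLoopbackPair, Session) and the one-word
messages are hypothetical; the in-memory loopback merely stands in for
WebSocket, Comet or long polling, and no real media API is called.

```typescript
// Hypothetical contract: anything that can move text between two peers.
interface SignalingTransport {
  send(text: string): void;
  onMessage(handler: (text: string) => void): void;
}

type Handler = (text: string) => void;

// In-memory stand-in for a real channel; delivery is synchronous.
function createLoopbackPair(): [SignalingTransport, SignalingTransport] {
  const handlers: Array<Handler | undefined> = [undefined, undefined];
  const endpoint = (self: number): SignalingTransport => ({
    send: (text: string): void => {
      const peerHandler = handlers[1 - self];
      if (peerHandler) peerHandler(text);
    },
    onMessage: (handler: Handler): void => {
      handlers[self] = handler;
    },
  });
  return [endpoint(0), endpoint(1)];
}

type SessionState = "idle" | "offered" | "active" | "held" | "ended";

// Session logic that is unaware of how its messages travel.
class Session {
  state: SessionState = "idle";
  private readonly transport: SignalingTransport;

  constructor(transport: SignalingTransport) {
    this.transport = transport;
    transport.onMessage((text) => this.handle(text));
  }

  start(): void { this.state = "offered"; this.transport.send("offer"); }
  hold(): void { this.state = "held"; this.transport.send("hold"); }
  end(): void { this.state = "ended"; this.transport.send("bye"); }

  private handle(text: string): void {
    if (text === "offer") {
      // Auto-accept for the purpose of the sketch.
      this.state = "active";
      this.transport.send("answer");
    } else if (text === "answer") {
      this.state = "active";
    } else if (text === "hold") {
      this.state = "held";
    } else if (text === "bye") {
      this.state = "ended";
    }
  }
}
```

Swapping the loopback for a SIP, XMPP or custom JSON transport would
not change the Session class, which is the flexibility argued for
above.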


Best regards.

-- 
Iñaki Baz Castillo
<ibc@aliax.net>