Re: [rtcweb] Making progress on the signaling discussion (NB: Action items enclosed!)

Iñaki Baz Castillo <> Wed, 05 October 2011 09:16 UTC


2011/10/5 Ted Hardie <>:
> Hi Iñaki,
> The chairs don't currently detect consensus on how signaling will be handled
> for RTCWeb sessions.  We don't want to circumscribe the solution space, but
> we do feel that there is a need to have concrete proposals, rather than
> broad statements like, "it shouldn't be in the component X".  Concrete
> proposals for how it will be handled are the best way we see to make sure
> folks are coming to consensus on something they understand, rather than on
> rhetoric.
> If you would like to make a concrete proposal for how that JavaScript object
> is constructed, what it contains, and how it is shipped around, that would
> be great.   Without a statement of what those details are, however, the
> chairs worry that people will argue (for and against) proposals that have
> not been made.
> We await your electrons with great interest!

Hi Ted, I'll be very busy over the next few days, but next week I'll try
to provide something more useful.

Anyhow, I think we must be careful here. Somebody may propose a
signaling mechanism that "works for him" but prevents others from
building their own signaling (if just those requirements are included
in rtcweb).

For example, I've recently seen a proposal for SDP handling in which
the client (the JS code in the browser) is a "stupid" element (sorry
for the word) and asks the server (via HTTP) for all the stuff related
to SDP. Maybe this could work for its specific use case, but IMHO we
should not rely on SDP processing/validating/whatever on the server side.
We now have all the tools to make a web browser + JS code an
intelligent element. The web page is not just a visual interface
anymore. Realtime signaling (via WebSocket) and media (rtcweb) are
already possible (or will be), and JavaScript allows building a real
application on the client side.

In my case (SIP over WebSocket) the client, a JavaScript SIP stack, is
a pure SIP client (which can also parse SDP bodies), and the "server"
is in fact a *pure* SIP proxy implementing the WebSocket transport
(along with UDP, TCP and so on). The WS server (that is, the SIP proxy) does
*nothing* special for clients on a web browser (it does not change the
SDP, it does not behave as a SIP B2BUA, it just routes SIP messages).
WebSocket is just a new SIP transport, nothing else. I will show it
next week.
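
To make the "WebSocket is just a transport" point concrete, here is a minimal sketch of how client-side JS could serialize a SIP response and hand it to the transport as text. The names buildSipResponse and sendOverTransport, and the shape of their arguments, are assumptions made for illustration; they are not taken from any existing SIP stack, and a real stack would copy Via, From, To, Call-ID and CSeq from the request.

```javascript
// Hypothetical helper: serialize a SIP response as text.
// The function name and argument shape are assumptions for illustration.
function buildSipResponse(statusCode, reason, headers, body) {
  const CRLF = "\r\n";
  const bodyText = body || "";
  // Content-Length counts bytes, not characters, so encode as UTF-8 first.
  const bodyLength = new TextEncoder().encode(bodyText).length;
  const lines = ["SIP/2.0 " + statusCode + " " + reason];
  for (const [name, value] of Object.entries(headers)) {
    lines.push(name + ": " + value);
  }
  if (bodyText) lines.push("Content-Type: application/sdp");
  lines.push("Content-Length: " + bodyLength);
  // A blank line separates the headers from the body.
  return lines.join(CRLF) + CRLF + CRLF + bodyText;
}

// The transport only needs a send(text) method, such as a browser
// WebSocket object. Nothing here depends on what the server does beyond
// routing the message.
function sendOverTransport(transport, sipText) {
  transport.send(sipText);
}
```

In a browser, the transport argument could be a WebSocket instance connected to the SIP proxy; for testing, any object with a send method will do.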

So I strongly need the rtcweb client (the web browser) to be able to
deal with SDP bodies (or JS objects mapping the information in a plain
or XML SDP) and to make WebRTC calls via a standardized JS API for
dealing with SDPs and sessions.
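
As a rough illustration of "JS objects mapping the information in a plain SDP", the sketch below splits an SDP body into session-level lines and media sections. The function name parseSdp and the shape of the returned object are assumptions for illustration only; no browser API is implied. It relies only on two facts about SDP: every line has the form <type>=<value>, and each media section starts with an m= line of the form "m=<media> <port> <proto> <fmt> ...".

```javascript
// Hypothetical helper: map a plain-text SDP body to a simple JS object.
// The returned shape ({ session, media }) is an assumption, not a standard.
function parseSdp(text) {
  const result = { session: [], media: [] };
  let current = null; // the media section being filled, if any
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    // Skip blank or malformed lines (every SDP line is "<type>=<value>").
    if (line.length < 2 || line[1] !== "=") continue;
    const type = line[0];
    const value = line.slice(2);
    if (type === "m") {
      // An m= line opens a new media section.
      const [kind, port, proto, ...formats] = value.split(" ");
      current = { kind, port: parseInt(port, 10), proto, formats, lines: [] };
      result.media.push(current);
    } else if (current) {
      // Lines after an m= line belong to that media section.
      current.lines.push({ type, value });
    } else {
      // Lines before the first m= line are session-level.
      result.session.push({ type, value });
    }
  }
  return result;
}
```

With an object like this, the JS code can check what the peer offers (for example, whether any entry in media has kind "video") without asking a server to interpret the SDP.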

For example, let's suppose I'm the custom JavaScript code running in a
browser and implementing SIP over WebSocket. I (the JavaScript code)
am speaking to the WebRTC JS API:

- I receive a SIP INVITE from a peer and render it to my human user
("Bob is calling you, answer? reject?"). The human user accepts the
call.

- I've received an SDP body from the peer (in the incoming INVITE via
WebSocket). WebRTC stack, please parse it and give me a JS object
representing it:

  var remote_sdp = new WebRTC.SDP(string, "plain");

- I examine the SDP object and realize that the peer is offering audio
and video.

- Now please generate my SDP as an answer to the received SDP:

  var my_sdp = WebRTC.SDP.answerFor(remote_sdp);

- ...but discard the video offered by the peer.


- Now I need the plain representation of my SDP (to be included within
the SIP 200 OK response):

  var my_plain_sdp = my_sdp.toPlain();

- I send the SIP 200 OK response via WebSocket.

- Now start the real media session!:

  var session = new WebRTC.Session(my_sdp, remote_sdp);

- After a while I want to invite the peer to a video session (over the
existing audio session). So first I add the video offer in my SDP
object (it would also increment the sessid field in the SDP):


- Then regenerate my plain SDP to be included in a re-INVITE to be
sent to the peer:

  var my_plain_sdp = my_sdp.toPlain();

- I get a 200 OK response with a new remote SDP, so parse it into a JS object:

  var remote_sdp = new WebRTC.SDP(string, "plain");

- I examine the remote SDP to tell my human user whether the video
session has been accepted by the peer or not.

  if (remote_sdp.hasStream("video"))
    alert("yeah! the peer accepted video!");

- And now tell the rtcweb stack to update the existing media session:

  session.update(my_sdp, remote_sdp);

- Now I want to put the call on hold, so:


- Then regenerate my plain SDP to be included in a re-INVITE to be
sent to the peer:

  var my_plain_sdp = my_sdp.toPlain();

- ...and so on...

NOTE: The above is just pseudo-code and of course needs a great deal of
improvement.
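
To show that the first steps above can be exercised end to end, here is a toy stand-in for the imagined WebRTC.SDP object. Everything in it is an assumption for illustration: the class name ToySdp, the method rejectStream (the message does not show the call used to discard video), and the naive answerFor, which merely copies the offer instead of negotiating codecs and addresses. The one protocol fact it relies on is that an SDP answer declines an offered stream by setting the port on its m= line to 0.

```javascript
// Toy stand-in for the WebRTC.SDP object imagined in the steps above.
// It only inspects m= lines; a real implementation would do far more.
class ToySdp {
  constructor(text) {
    // Keep non-empty lines in order so toPlain() can reproduce them.
    this.lines = text.split(/\r?\n/).filter((l) => l.length > 0);
  }

  // True if an m= line of this kind exists with a non-zero port.
  hasStream(kind) {
    return this.lines.some((l) => {
      if (!l.startsWith("m=" + kind + " ")) return false;
      return parseInt(l.split(" ")[1], 10) !== 0;
    });
  }

  // Hypothetical method name: decline a stream by setting its port to 0.
  rejectStream(kind) {
    this.lines = this.lines.map((l) => {
      if (!l.startsWith("m=" + kind + " ")) return l;
      const parts = l.split(" ");
      parts[1] = "0";
      return parts.join(" ");
    });
  }

  // Plain-text form, ready to be placed in a SIP message body.
  toPlain() {
    return this.lines.join("\r\n") + "\r\n";
  }

  // Naive placeholder: copies the offer rather than building a real answer.
  static answerFor(remote) {
    return new ToySdp(remote.toPlain());
  }
}

// Walk through the first steps: parse, answer, discard video, serialize.
const offerText =
  "v=0\r\ns=-\r\nt=0 0\r\n" +
  "m=audio 49170 RTP/AVP 0\r\n" +
  "m=video 51372 RTP/AVP 31\r\n";
const remoteSdp = new ToySdp(offerText);
const mySdp = ToySdp.answerFor(remoteSdp);
mySdp.rejectStream("video");
const myPlainSdp = mySdp.toPlain(); // goes into the SIP 200 OK body
```

After these calls, remoteSdp still reports a video stream while mySdp does not, which is the distinction the hasStream("video") check in the steps above depends on.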

That said, I think that stating the requirements for *any* signaling
mechanism should be the key right now, rather than proposing a specific
signaling mechanism that "works for me and covers my specific needs
(and I don't care about others)". IMHO defining what is needed in the
WebRTC API is the only important thing right now.

Best regards.

Iñaki Baz Castillo