Re: [rtcweb] Proposal for a JS API for NoPlan (adding multiple sources without encoding them in SDP)

Iñaki Baz Castillo <> Tue, 18 June 2013 11:30 UTC

2013/6/18 Emil Ivov <>:
> This is exactly what No Plan tries to avoid. It gets you
> interoperability for widely deployed legacy (as in SIP phones, popular
> free SBCs and PSTN) and then it lets you do what you want for all the
> fancy use cases that you are going to come up with. You are in no way
> hampered by SDP this way.

Hi Emil,

I understand your aim and really appreciate your effort (as I said
before, it is much better than fully depending on SDP for any media
operation). However, let me raise the following questions; I would
be very grateful if you could answer inline:

* A SIP over WebSocket JS app is running in the browser (so SIP
signaling is used on the wire).

* I send the initial INVITE with the SDP I retrieve from the local PC
(I am the JS app).

* Later I want to put the remote peer on hold, so I need to send a
re-INVITE with the same SDP, except for these differences:

- An increased SDP version number.
- a=inactive in every m= section.

* How do I do that? As far as I can imagine:

- I need to keep the initial SDP or retrieve it again from the PC, so
I get a string blob. I can "expect" how such a string blob looks, but
I cannot be sure, since the same SDP can be represented in multiple
ways.

- So I need to *parse* the SDP at JS level, right? And then I need to
mangle it (to increase the version number and add a=inactive).

- And then I must pause/stop my local streams (via PC API call),
generate the re-INVITE with the mangled SDP, and send it to the
remote, right?

- Then the remote peer receives it. Since it is also a SIP JS app, it
must *parse* the SDP at JS level and realize that there is an a=inactive
line in every m= section, right? Then it could notify the human user
that "the remote has put the call on hold".

- Then, should the remote JS app pass the SDP to its PC? I expect not
(just wondering).

- And finally, the remote peer (its JS SIP app) should also mangle its
initial SDP to increase the version number and add a=inactive or
a=recvonly in every m= section, right? And then send it in a SIP
200 OK.
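
To make the mangling steps above concrete, here is a minimal sketch of what the JS app would have to do to the string blob. It assumes the o= line follows the RFC 4566 layout (sess-version is the third field) and that direction attributes appear on their own lines; `putOnHold` is a hypothetical helper name, not part of any WebRTC API.

```javascript
// Sketch only: rewrite an SDP blob for "hold".
// Assumption: o=<username> <sess-id> <sess-version> <nettype> <addrtype> <address>
function putOnHold(sdp) {
  const direction = /^a=(sendrecv|sendonly|recvonly|inactive)$/;
  const out = [];
  let inMedia = false;

  for (const line of sdp.split(/\r?\n/)) {
    // Drop empty lines and any existing direction attribute.
    if (line === "" || direction.test(line)) continue;

    if (line.startsWith("o=")) {
      const fields = line.slice(2).split(" ");
      // sess-version can be larger than 2^53, so use BigInt for the increment.
      fields[2] = (BigInt(fields[2]) + BigInt(1)).toString();
      out.push("o=" + fields.join(" "));
      continue;
    }

    if (line.startsWith("m=")) {
      // Close the previous m= section with a=inactive before starting a new one.
      if (inMedia) out.push("a=inactive");
      inMedia = true;
    }
    out.push(line);
  }

  // Close the last m= section.
  if (inMedia) out.push("a=inactive");
  return out.join("\r\n") + "\r\n";
}
```

Even this small helper has to guess at line endings, attribute placement and number sizes, which is the fragility described above.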

So, unless I'm completely wrong, a simple use case (hold/unhold, which
is so widespread in SIP) requires:

* SDP parsing at JS.
* SDP mangling at JS.

And such parsing and mangling must be done over a string blob that
can have multiple representations (depending on each WebRTC
implementation). Even worse, some non-browser WebRTC devices may add
a=inactive/sendonly before the m= sections (as a global SDP attribute),
so my JS app must be ready to detect it, right?
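
As an illustration of the detection problem, a receiving JS app would need something like the following sketch. It assumes that a direction attribute placed before the first m= line applies to every m= section that does not override it, and that the default is sendrecv; `mediaDirections` and `remoteIsHolding` are hypothetical names.

```javascript
// Sketch only: work out the effective direction of each m= section.
function mediaDirections(sdp) {
  const direction = /^a=(sendrecv|sendonly|recvonly|inactive)$/;
  let sessionLevel = "sendrecv"; // assumed default when nothing is signalled
  const media = [];              // one entry per m= section, null = not set

  for (const line of sdp.split(/\r?\n/)) {
    const m = direction.exec(line);
    if (line.startsWith("m=")) {
      media.push(null);
    } else if (m) {
      if (media.length === 0) {
        sessionLevel = m[1];               // global attribute before any m= line
      } else {
        media[media.length - 1] = m[1];    // attribute inside the current m= section
      }
    }
  }
  return media.map((d) => d || sessionLevel);
}

// Treat the call as held when every m= section is inactive or sendonly.
function remoteIsHolding(sdp) {
  const dirs = mediaDirections(sdp);
  return dirs.length > 0 && dirs.every((d) => d === "inactive" || d === "sendonly");
}
```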

Does it really look nice? IMHO the mechanism you propose makes WebRTC
easy for very-very-very-ultra-basic use cases (basically an
audio/video call with no media changes) and makes it really complex
to implement any kind of media modification or media application
(e.g. "hold" / "unhold").

Another example:

* I am a powerful SIP conference server which properly implements
WebRTC. I initiate a call to 5 users (each running a JS SIP app in
their browser). The initial INVITE has SSRC/MSID fields in the SDP
identifying all the participants, am I right?

* Later, during the conference, I call a sixth participant and bring
him into the conference, so I need to send a re-INVITE to every
participant with a modified version of the SDP (note that this is the
SIP protocol, so I need to use SIP messages to carry the new info about
SSRC/MSID and so on).

* Magically I (the server) create the new SDP with the SSRC/MSID of
the new stream associated with the new participant, and send it in a
re-INVITE to every participant.

* Now a participant receives it and must be able to know what has been
modified (in order to render it in the HTML, e.g. by drawing a new
<video> element). Your draft states that those media modifications
(track additions) don't require re-sending an SDP, but in a SIP context
this must be done with a new SDP, so:

- The JS SIP app needs to *parse* the SDP at JS level, which is a
complex task. It must notice the new SSRC/MSID (but it should also
detect whether an existing participant has left the conference).

- Then it draws the new <video> element for showing the video of the
new participant.

- And some WebRTC events would be automatically fired in the remote JS
app due to the Track modifications, which are detected by the WebRTC
stack due to the presence of the new SSRC/MSID in the RTP, right?

- And then the remote JS app must create a new SDP (a modified/mangled
version of the SDP in the initial SIP 200 OK response) with these
changes:

  - New version value.
  - Probably new lines/attributes due to the new SSRC/MSID (not sure about it).
  - And all of this at JS level by mangling the string blob.

- And send it back to the conference server in a SIP 200 response.
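
The "what has changed" step in the conference example can be sketched as a diff between the previous and the new SDP. This assumes sources are signalled as `a=ssrc:<ssrc> msid:<stream-id> <track-id>` lines, which is only one of the possible representations; `ssrcMsidMap` and `diffSources` are hypothetical helpers.

```javascript
// Sketch only: collect "a=ssrc:<ssrc> msid:<stream-id> [<track-id>]" lines,
// keyed by SSRC. Other a=ssrc attributes (cname, label, ...) are ignored.
function ssrcMsidMap(sdp) {
  const re = /^a=ssrc:(\d+) msid:(\S+)(?: (\S+))?$/;
  const map = new Map();
  for (const line of sdp.split(/\r?\n/)) {
    const m = re.exec(line);
    if (m) map.set(m[1], { stream: m[2], track: m[3] || null });
  }
  return map;
}

// Compare two SDP blobs and report which SSRCs appeared or disappeared.
function diffSources(oldSdp, newSdp) {
  const before = ssrcMsidMap(oldSdp);
  const after = ssrcMsidMap(newSdp);
  return {
    added: [...after.keys()].filter((ssrc) => !before.has(ssrc)),
    removed: [...before.keys()].filter((ssrc) => !after.has(ssrc)),
  };
}
```

The app would draw a new <video> element for each entry in `added` and remove the one associated with each entry in `removed`.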

Well, IMHO this is the most tedious and error-prone mechanism I've
ever seen. How can it be made simpler?

- The WebRTC PC should not return an SDP (nor expect an SDP string), but
just a JS object with the required attributes.

- The JS app then builds an SDP (if it needs one due to the signaling
protocol, as in the examples above).

- In this way the JS app has full control over the SDP associated with
the session and can safely modify it.

- And in 99% of WebRTC use cases (where no SIP is involved) the JS app
does not need to deal with a string blob retrieved from the

So, why not leave the SDP game to JS libraries? In this way, a
telco/provider/vendor could offer a JS app that does the required
magic to interop with its media gateways rather than leaving that
responsibility to browsers.
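
A rough sketch of what such a JS library could look like follows. The description object and its field names (`sessionId`, `version`, `address`, `media`, `codecs`, `direction`) are purely hypothetical; no WebRTC API returns this shape. The point is only that hold becomes an object update, and SDP is serialized at the very end by code the app controls.

```javascript
// Sketch only: serialize a hypothetical description object into SDP.
function buildSdp(desc) {
  const lines = [
    "v=0",
    `o=- ${desc.sessionId} ${desc.version} IN IP4 ${desc.address}`,
    "s=-",
    `c=IN IP4 ${desc.address}`,
    "t=0 0",
  ];
  for (const m of desc.media) {
    const payloadTypes = m.codecs.map((c) => c.payloadType).join(" ");
    lines.push(`m=${m.kind} ${m.port} ${m.protocol} ${payloadTypes}`);
    for (const c of m.codecs) {
      lines.push(`a=rtpmap:${c.payloadType} ${c.name}/${c.clockRate}`);
    }
    lines.push(`a=${m.direction}`);
  }
  return lines.join("\r\n") + "\r\n";
}

// Hold without touching any string: copy the object, bump the version,
// and mark every media section inactive.
function hold(desc) {
  return {
    ...desc,
    version: desc.version + 1,
    media: desc.media.map((m) => ({ ...m, direction: "inactive" })),
  };
}
```

A SIP library would call `buildSdp(hold(desc))` when generating the re-INVITE, while an app with no SIP involved would never need to serialize SDP at all.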

Please take a look at this article, which claims more or less the same:


Iñaki Baz Castillo