Re: [rtcweb] Minimal SDP negotiation mechanism

Hadriel Kaplan <> Tue, 20 September 2011 21:43 UTC

From: Hadriel Kaplan <>
To: Roman Shpount <>
Date: Tue, 20 Sep 2011 21:45:29 +0000
List-Id: Real-Time Communication in WEB-browsers working group list <>

Yup, that pretty much sums it up. :)
Some minor comments inline...

On Sep 20, 2011, at 10:53 AM, Roman Shpount wrote:

> There are actually three related issues here:
> 1. As mentioned earlier, a single offer in SIP can create multiple dialogs with independent answers. The dialogs are independent of one another, and it is up to the end-user device to choose how to render them (mix the audio, play the audio from the latest, display multiple videos side by side). These dialogs can be early (created by a provisional response) or final (created by a success response), but this distinction typically does not affect the media plane. There are a number of common use cases for forking with multiple answers, such as service announcements (play an announcement from a media server using an early dialog, then start playing audio from the call using another dialog), color ring back (play custom music while dialing the user), and find me/follow me (where the call can be answered by both the desk phone and the cell phone, creating two final dialogs). One of the biggest design issues with such multiple-answer fork scenarios is that there is no way to map received media to the actual answer, since nothing in the answer SDP identifies the response RTP stream.

So far the discussion in rtcweb has been that ICE will be required.  As such, you can actually correlate received media with the SDP, because the username in the STUN connectivity checks from each remote peer is uniquely indicated in that peer's SDP, so media received on the same 5-tuple is from that peer. (Once you get the SDP answer(s), of course, which per ICE should happen rather quickly.)
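
To make the correlation concrete, here is a rough Python sketch, not a real ICE implementation. It relies on the ICE rule that the STUN USERNAME in a connectivity check has the form "receiver-ufrag:sender-ufrag", and that each answer carries its own a=ice-ufrag line. The class, the method names, and the answer IDs are hypothetical, made up purely for illustration.

```python
import re
from typing import Dict, Optional, Tuple

# (protocol, local_ip, local_port, remote_ip, remote_port)
FiveTuple = Tuple[str, str, int, str, int]


def ice_ufrag(sdp: str) -> Optional[str]:
    """Return the first a=ice-ufrag value found in an SDP blob, if any."""
    match = re.search(r"^a=ice-ufrag:(\S+)", sdp, re.MULTILINE)
    return match.group(1) if match else None


class AnswerCorrelator:
    """Hypothetical helper: maps received media to a forked SDP answer."""

    def __init__(self, local_ufrag: str) -> None:
        self.local_ufrag = local_ufrag
        self.answer_by_ufrag: Dict[str, str] = {}
        self.ufrag_by_tuple: Dict[FiveTuple, str] = {}

    def on_answer(self, answer_id: str, sdp: str) -> None:
        """Remember which answer advertised which ICE username fragment."""
        ufrag = ice_ufrag(sdp)
        if ufrag is None:
            raise ValueError("answer has no a=ice-ufrag line")
        self.answer_by_ufrag[ufrag] = answer_id

    def on_stun_check(self, five_tuple: FiveTuple, username: str) -> None:
        """Record the sender's ufrag for the 5-tuple the check arrived on."""
        local, sep, remote = username.partition(":")
        if not sep or local != self.local_ufrag:
            return  # not addressed to our offer; ignore
        self.ufrag_by_tuple[five_tuple] = remote

    def answer_for_media(self, five_tuple: FiveTuple) -> Optional[str]:
        """Return the answer ID for media on this 5-tuple, or None if unknown."""
        ufrag = self.ufrag_by_tuple.get(five_tuple)
        if ufrag is None:
            return None
        return self.answer_by_ufrag.get(ufrag)
```

Because the ufrag is stored per 5-tuple as soon as a check arrives, media that shows up before the corresponding answer simply stays unattributed until on_answer() is called for that peer.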

If we don't require ICE, we're still gonna require symmetric RTP, so the SDP answer's c/m-line info could be used to correlate the media in many cases.
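
A similarly rough Python sketch of the non-ICE fallback: with symmetric RTP the peer sends from the same address and port it advertised, so the c=/m= lines of each answer can be indexed and looked up by the source of incoming packets. The function names and answer IDs are hypothetical, the parsing is deliberately simplistic (unicast c= lines only, no NAT handling), and it cannot tell answers apart when they share an address and port, which is why this only works "in many cases".

```python
from typing import Dict, List, Optional, Tuple

Endpoint = Tuple[str, int]  # (ip, port)


def media_endpoints(sdp: str) -> List[Endpoint]:
    """Return the (ip, port) advertised by each active m= line of an SDP."""
    session_addr: Optional[str] = None
    media: List[List] = []  # each entry is [port, media-level address or None]
    for raw in sdp.splitlines():
        line = raw.strip()
        if line.startswith("c="):
            addr = line.split()[2]  # "c=IN IP4 192.0.2.10" -> "192.0.2.10"
            if media:
                media[-1][1] = addr  # media-level c= overrides session level
            else:
                session_addr = addr
        elif line.startswith("m="):
            port = int(line.split()[1])  # "m=audio 49170 RTP/AVP 0" -> 49170
            media.append([port, None])
    endpoints = []
    for port, addr in media:
        addr = addr or session_addr
        if addr is not None and port != 0:  # port 0 marks a rejected stream
            endpoints.append((addr, port))
    return endpoints


def index_answers(answers: Dict[str, str]) -> Dict[Endpoint, str]:
    """Map each advertised endpoint to the ID of the answer that carried it."""
    index: Dict[Endpoint, str] = {}
    for answer_id, sdp in answers.items():
        for endpoint in media_endpoints(sdp):
            index[endpoint] = answer_id
    return index


def answer_for_packet(index: Dict[Endpoint, str], src: Endpoint) -> Optional[str]:
    """Look up the answer whose advertised endpoint matches the packet source."""
    return index.get(src)
```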

> 2. The second scenario is non-standard, but very widely used -- in the same dialog, different SDP answers are sent in the provisional and final SIP responses. This is usually a result of forking being masked by a B2BUA or SBC. Even though this is non-standard, it is usually trivial to implement, since all that needs to be done is to reinitialize the media stream based on the new answer.
> 3. Even when multiple answers are not used, multiple different media streams can be sent based on the offer. First of all, it is common to receive media before the signaling response is received. Second, the standard requires that the offerer be ready to receive media once the offer is sent, and there is nothing that prevents multiple media streams from being sent to it. Furthermore, multiple answers only define what media should be sent to the answerer and where, but in no way specify how many remote parties will send media to the offerer, from which addresses, or of what type. New media streams with new SSRCs, from new addresses, with any media type present in the offer, can be added at any time without an offer/answer exchange. Typically this is supported in such a way that only the newest media stream is played, after a probation interval, and older media streams are ignored, but other more robust solutions are possible.
> If we plan to interop with existing VoIP solutions, we will need to support 1 and 2, as well as define the expected behavior for 3.
> _____________
> Roman Shpount