Re: [rtcweb] Forking & Early Media - Proposal

"Olle E. Johansson" <> Wed, 21 September 2011 07:23 UTC

From: "Olle E. Johansson" <>
Date: Wed, 21 Sep 2011 09:26:11 +0200
To: Randell Jesup <>


If you want to go down this whole route and include all this application stuff in rtcweb, we might as well include all of SIP, because all of these issues have been worked on in the SIP world for a long time. 

Personally, I see rtcweb as a much lower layer than you do, which in my world excludes all this complexity.

SIP has been around for many years, and some of the issues you bring to the table haven't been solved yet or are poorly implemented in endpoints because the implementers don't understand the complexity. I see no point in trying to solve all of that again with yet another signalling protocol.

We keep going back and forth on the signalling issue here - is it part of RTCweb? If yes, then we'll have forking, early media, DTMF and all that luggage, including complex ISDN signalling scenarios that are just part of the German PSTN network and become a requirement in RTCweb since they work in ISDN. You'll have requirements that the browser just HAS to support five different SIP Subscribe event packages - or the rtcweb version.

If we put signalling out of scope and let the application builder handle signaling and/or start a separate working group that can define "SIP lite" I believe we can make progress in rtcweb by focusing on a limited set of issues.

Darn. I feel like the old employee saying "we tried that years ago, it did not work" to kill all the inspired people that believe they can make a change. My apologies. Prove me wrong :-)


On 21 Sep 2011, at 09:07, Randell Jesup wrote:

> NOTE: Attached below is a proposed set of forking/early-media
> and clipping-avoidance rules, so don't glance and delete!  :-)
> Also note: I started writing this earlier today, so it was largely
> done before much of today's discussion on forking.  I'll note that
> I include in this a method to minimize chances of answer-time
> clipping.  (For any who don't know (if there are any), this is
> where the first fraction of a second after pickup is lost while
> answering, starting codecs, doing ICE, etc.)
> On 9/20/2011 9:40 AM, Olle E. Johansson wrote:
>> On 20 Sep 2011, at 15:15, Christer Holmberg wrote:
>>>>> Once we start requiring that the PeerConnection know the
>>>>> difference between "early" media and "late" media, it seems
>>>>> to me we're slipping down a slippery slope.
>>>> The difference between early and late media is purely a
>>>> billing decision in PSTN. I don't think we should separate
>>>> these on the rtcweb side. It's a PSTN gateway issue, not
>>>> something to be bothered with in rtcweb.
>>> It's not about knowing the difference between "early" and "late" media - it's about whether the API and browser need to support multiple SIMULTANEOUS SDP answers - or whether we assume that the JS SIP app will always, at any given time, only provide ONE SDP answer to the API and browser.
>> I just wanted to get rid of the early/late media discussion. As you state, the forking issue with getting multiple responses is a separate issue.
>> Do we have any use cases using forking? Is forking a desired feature or something that SIP brought in?
> No, this is something inherent in a person you want to converse with
> possibly being in different places.  Different phones in a home,
> different computers in a home or out of it (your desktop, your laptop,
> your tablet, your work computer, your Android phone) - when someone
> wants to talk to you on Skype or what have you, often the service will
> want to offer the connection to any and all devices you're logged into
> the service from.  So, it forks the request.  We'd have this issue
> even if we totally disallow SIP and disallow PSTN connectivity.  If
> you require that the website/server handle this and only provide one
> answer, you're much more likely to clip the answer (lose audio right
> after accept while the channels are being opened).
> Two things in particular appear here.  One is early media (I want to
> send media to you but no one has accepted).  I do not propose that
> rtcweb generate early media; some sort of "alerting" notification is
> enough (equivalent to 180).  (Realize that means no custom callback
> tones or video, or weird cases like sitting on hold or in an IVR while
> not actually "in" a call).  If so, we only have to worry about interop
> cases - calling out to legacy, or *maybe* a call forked in rtcweb
> where one of the forks goes to a legacy device or gateway that sends
> early media.
> The other is choosing which answer to accept if multiple arrive; that
> can be up to the application I think (though 99% likely the app will
> want to use the first answer).  I don't think we have to *mandate*
> that the first answer is the one we use (I can't think of any
> cases where we wouldn't, but I'm pretty sure they exist and I wouldn't
> want to outlaw them for no reason).  If it makes any use-cases easier
> to mandate the first answer, that may change my opinion.  If you're
> using SIP (JS or not) that might affect the answer, of course.
> While waiting for an acceptance, it makes *lots* of sense to "warm up"
> the connection(s) so that when the call is accepted there's minimal
> delay or pickup loss.  "warm up" means to do an ICE exchange and
> possibly even instantiate codecs, etc.  This is complicated by not
> knowing the final answer until the user decides how to answer, but you
> could warm up the likely streams/codecs in most cases, and drop some
> if needed on ACCEPT.  In the forking case, you could warm up
> connections to some or all possible answers.  (Pacing may be an issue
> here, but often there are 5-20 seconds to do it in.)
> Implicit in this is separating ANSWERs from "acceptance", and
> verifying on "acceptance" that the correct ANSWER is used (for
> example, we warm up audio and video, and the person answers
> audio-only, or for some reason chooses a different codec).
> So, to summarize in pseudo-spec language:
> 0)   I'm assuming an Offer-Answer model here, though not assuming SDP.
>     If you want, read "SDP ANSWER" for "ANSWER", etc to map to Harald's
>     proposals.  Note that I add "ACCEPT".
> 0.1) Rough mapping to SIP:
>     a) INVITE ->  OFFER
>     b) 183 ->  ANSWER
>     c) 180 ->  ANSWER-with-no-media-streams
>     d) 200 ->  ANSWER (may be suppressed) + ACCEPT
> 0.2) I'm assuming OFFERs and ANSWERs and ACCEPTs are delivered on
>     a reliable, in-order channel.
> 1) webrtc clients WILL NOT send early media
>   [See below; I see no real need for webrtc<->webrtc client connections
>    to send early media, but SIP/PSTN interop cases may require it, so
>    I have an alternative below]
> 2) when a webrtc client receives an OFFER, it MAY generate a speculative
>   ANSWER in order to allow pre-starting the PeerConnection in a disabled
>   state.  If pre-started, NO media shall be sent until the call has been
>   ACCEPTED.  Note that the OFFERer may receive data before seeing
>   the ACCEPT.
> 3) if the ANSWERer generated a speculative ANSWER, it may replace that
>   with an alternative ANSWER before sending ACCEPT.  This alternative
>   SHOULD use the same connection address as the original, and if so
>   the existing PeerConnection established or being established SHOULD
>   be retained, but the mediastream configuration changed to match
>   the new ANSWER.
> 4) the OFFERer SHOULD pre-start PeerConnections on a speculative ANSWER, or
>   they MAY wait until an ACCEPT and then start the last ANSWER from that
>   source.  If multiple sources supply speculative ANSWERs, the OFFERer
>   MAY pre-start some, none or all of them as it wishes.
>   [Open question: do we pre-start MediaStreams in each pre-starting
>    PeerConnection, or do (can) we defer this until ACCEPT?]
> 5) when the OFFERer receives an ACCEPT, it MAY close other PeerConnections
>   opened speculatively.
> 6) when an ANSWERer sends an ACCEPT, it MAY begin sending media immediately
>   if the PeerConnection was pre-started.  It SHOULD be ready to receive
>   media before sending the ACCEPT.
> 7) servers handling signalling for webrtc clients MAY fork a call offer
>   to multiple webrtc clients
> 8) if a call is forked, the webrtc client MAY receive either a single
>   ANSWER and ACCEPT, or MAY receive multiple ANSWERs with one or more
>   ACCEPTs, depending on how the server works.
> This provides a way to minimize the chances of start-of-call clipping,
> and handles forking with minimal clipping (with cooperation of the
> app).  Note that there may be an implementation limit on the number of
> PeerConnections that can be "warmed up" before an ACCEPT.
> Yes, if we remove 1) and replace it with (probably lower down)
> N) webrtc clients MAY send "early media" on a pre-started PeerConnection
>   but MUST NOT send any media without explicit action or consent of the
>   user.  webrtc clients MAY play the early media.
> or
> N) webrtc media gateways MAY send "early media" on a pre-started PeerConnection,
>   and webrtc clients receiving "early media" MAY play it, and MAY send
>   media (such as DTMF) but MUST NOT send any media without explicit
>   action or consent of the user.
>  (and you have to change 2) above)
> you get something that is pretty interoperable with legacy SIP devices
> and especially PSTN gateways or border controllers, including the infamous
> American Airlines DTMF trick.  This assumes a WebRTC<->legacy media gateway
> is in use (note that all the above is about PeerConnections).  I have not
> tried to figure out how non-gatewayed legacy would work into this, but it
> should be doable.
> -- 
> Randell Jesup

* Olle E Johansson -
* Cell phone +46 70 593 68 51, Office +46 8 96 40 20, Sweden
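
For readers who want to see the quoted OFFER/ANSWER/ACCEPT rules in concrete form, here is a minimal sketch of the OFFERer side (rules 3, 4, 5 and 8) plus the rough SIP mapping from item 0.1. It is only an illustration under stated assumptions, not anything proposed on the list: the names Offerer, max_warm, on_answer and on_accept are hypothetical, ANSWERs are plain strings, the "first ACCEPT wins" policy is one possible application choice, and no real PeerConnection API is involved.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional, Set


class Msg(Enum):
    OFFER = "OFFER"
    ANSWER = "ANSWER"
    ACCEPT = "ACCEPT"


# Rough SIP mapping, copied from item 0.1 of the quoted proposal.
SIP_TO_RTCWEB = {
    "INVITE": [Msg.OFFER],
    "183": [Msg.ANSWER],
    "180": [Msg.ANSWER],              # ANSWER with no media streams
    "200": [Msg.ANSWER, Msg.ACCEPT],  # the ANSWER may be suppressed
}


@dataclass
class Offerer:
    """Hypothetical model of the OFFERer; all names here are assumptions."""
    max_warm: int = 4  # assumed implementation limit on warmed-up connections
    answers: Dict[str, str] = field(default_factory=dict)  # source -> latest ANSWER
    warm: Set[str] = field(default_factory=set)            # pre-started sources
    accepted: Optional[str] = None

    def on_answer(self, source: str, answer: str) -> None:
        if self.accepted is not None:
            return  # call already accepted; ignore late ANSWERs
        # A later ANSWER from the same source replaces the speculative one.
        self.answers[source] = answer
        # Pre-starting is optional; here we warm up until the limit is hit.
        if len(self.warm) < self.max_warm:
            self.warm.add(source)

    def on_accept(self, source: str) -> List[str]:
        """Return the sources whose speculative connections are closed."""
        if self.accepted is not None or source not in self.answers:
            return []  # assumed policy: the first valid ACCEPT wins
        self.accepted = source
        closed = sorted(self.warm - {source})
        self.warm &= {source}
        return closed


if __name__ == "__main__":
    o = Offerer(max_warm=2)
    o.on_answer("laptop", "audio+video")
    o.on_answer("phone", "audio+video")
    o.on_answer("phone", "audio-only")  # replaces the speculative ANSWER
    print(o.on_accept("phone"))         # connections closed after ACCEPT
    print(o.answers["phone"])           # the ANSWER actually in force
```

The sketch keeps the latest ANSWER per source separate from the set of warmed-up connections, so a source that was never pre-started can still be started from its last ANSWER once its ACCEPT arrives.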