[rtcweb] My Opinion: Why I think a negotiating protocol is a Good Thing

Harald Alvestrand <harald@alvestrand.no> Tue, 18 October 2011 11:43 UTC

Message-ID: <4E9D667A.2040703@alvestrand.no>
Date: Tue, 18 Oct 2011 13:43:54 +0200
From: Harald Alvestrand <harald@alvestrand.no>
To: "rtcweb@ietf.org" <rtcweb@ietf.org>

(Apologies for the length of this note - but I wanted to follow this 
argument a bit deeply)

In the discussions about RTCWEB signalling, I’ve had a change of heart.
Or perhaps it’s just a change of terminology.

The high bit is this:
I believe we need to standardize a negotiation protocol as part of RTCWEB.
This needs to be described as a protocol, and cannot usefully be 
described as an API.

This note is to explain to my fellow WG members what led me to this 
conclusion, and - in some detail - what I think the words in the above 
paragraph mean.
Those who don't want to read a long message can stop here.

The context of RTCWEB is the well known trapezoid:

       +-----------+             +-----------+
       |    Web    |             |    Web    |
       |           |  Signalling |           |
       |           |-------------|           |
       |  Server   |    path     |  Server   |
       |           |             |           |
       +-----------+             +-----------+
            /                           \
           /                             \   Proprietary over
          /                               \  HTTP/Websockets
         /                                 \
        /  Proprietary over                 \
       /   HTTP/Websockets                   \
      /                                       \
+-----------+                           +-----------+
+-----------+                           +-----------+
+-----------+                           +-----------+
|           |                           |           |
|           |                           |           |
|  Browser  | ------------------------- |  Browser  |
|           |        Media path         |           |
|           |                           |           |
+-----------+                           +-----------+

or even the triangle, which is formed when the two Web servers on top 
are collapsed into one.

A design criterion for RTCWEB has been that it should be possible to 
write applications on top of RTCWEB simply - that is, without deep 
knowledge about the world of codecs, RTP sessions and the like.
Another design criterion is that interworking should be possible - which 
means that SOMEWHERE in the system, deep knowledge about the world of 
codecs, RTP sessions and the like must be embedded; we can’t just 
simplify our options until everything’s simple.

There’s one place in the ecosystem where this knowledge HAS to be - and 
that is within the browser, the component that takes care of 
implementing the codecs, the RTP sessions and the related features. If 
we can avoid requiring embedding it twice, that’s a feature.

It used to be that I was a believer in APIs - that we should make the 
API the “king”, and describe the way you generate an RTP session as “you 
turn this knob, and get this effect”.
After looking at the problem of Web applications that don’t have domain 
knowledge for a while, I’ve concluded that this doesn’t work. There’s 
the need for one browser to communicate with the other browser, and if 
the intermediate components are to have the ability to ignore the 
details, what’s communicated has to be treated like a blob - something 
passed out of one browser’s API and into another browser’s API, 
unchanged by the application - because the application must be able to 
ignore the details.
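
To make the blob idea concrete, here is a minimal TypeScript sketch. All 
the names in it (NegotiationBlob, FakeBrowser, relay) are hypothetical 
and the blob content is made up; nothing here is a proposed API. The 
only point is that the application forwards the blob without reading it.

```typescript
// Hypothetical sketch: none of these names come from an actual RTCWEB API.

// The application sees the blob as an opaque string.
type NegotiationBlob = string;

// Stand-in for the browser's media engine, the only place with codec knowledge.
class FakeBrowser {
  received: NegotiationBlob[] = [];

  // Produce a blob describing what this browser can do (content is made up).
  createBlob(): NegotiationBlob {
    return JSON.stringify({ codecs: ["codecA"], session: 1 });
  }

  // Accept a blob that originated in the other browser.
  acceptBlob(blob: NegotiationBlob): void {
    this.received.push(blob);
  }
}

// A "simple" application: it moves blobs around and never parses them.
function relay(from: FakeBrowser, to: FakeBrowser): NegotiationBlob {
  const blob = from.createBlob();
  to.acceptBlob(blob); // passed through unchanged
  return blob;
}

const alice = new FakeBrowser();
const bob = new FakeBrowser();
const sent = relay(alice, bob);
console.log(bob.received[0] === sent); // true: the blob arrived unchanged
```

The relay function could just as well hand the blob to a server in 
between; as long as nothing on the way edits it, the two browsers are 
the only parties that need to understand it.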

OK, now we have the API with blobs. We also have to make some 
assumptions about how those blobs are transported, who’s responsible for 
acting on them, and so on. And we have to make sure different browsers 
implement the blob the same way - that is, it has to be standardized.
What’s more - we DO want to enable applications that are NOT simple. 
Including gateways, which are not browsers. So applications must be free 
to look inside the blob - break the blob boundary - when they need to. 
So this pulls in the same direction as the need for interoperability - 
the format, semantics and handling rules for these blobs have to be 
specified. In some detail.

So we have:
- a data format
- a transmission path
- a set of handling rules, in the form of states, processes and timers
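
As a rough illustration of those three ingredients, here is a small 
TypeScript sketch. The state names, the offer/answer message kinds and 
the 5000 ms timeout are my own assumptions for the example, not taken 
from any specification; time is passed in explicitly so the timer can be 
exercised without waiting.

```typescript
// Hypothetical sketch: states, message kinds and the timeout value are
// illustrative assumptions, not taken from any specification.

type State = "idle" | "offer-sent" | "established" | "failed";

// Data format: a tagged message whose body is opaque to the transport.
interface Message {
  kind: "offer" | "answer";
  body: string;
}

class Negotiator {
  state: State = "idle";
  private deadline = 0;
  private send: (m: Message) => void;
  private timeoutMs: number;

  // `send` is the transmission path; the negotiator does not care what it is.
  constructor(send: (m: Message) => void, timeoutMs: number) {
    this.send = send;
    this.timeoutMs = timeoutMs;
  }

  // Process: start a negotiation and arm the timer.
  start(now: number, body: string): void {
    if (this.state !== "idle") throw new Error("already negotiating");
    this.send({ kind: "offer", body });
    this.state = "offer-sent";
    this.deadline = now + this.timeoutMs;
  }

  // Handling rule: an answer only counts while an offer is outstanding.
  receive(now: number, m: Message): void {
    this.tick(now);
    if (this.state === "offer-sent" && m.kind === "answer") {
      this.state = "established";
    }
  }

  // Timer: give up if no answer arrived before the deadline.
  tick(now: number): void {
    if (this.state === "offer-sent" && now >= this.deadline) {
      this.state = "failed";
    }
  }
}

const sent: Message[] = [];
const trace: State[] = [];

const n = new Negotiator((m) => sent.push(m), 5000);
n.start(0, "blob");
trace.push(n.state);
n.receive(100, { kind: "answer", body: "blob" });
trace.push(n.state);

const late = new Negotiator(() => {}, 5000);
late.start(0, "blob");
late.tick(6000); // no answer within the timeout
trace.push(late.state);

console.log(trace.join(" -> ")); // offer-sent -> established -> failed
```

Once the behaviour is written down as states, messages and timers like 
this, it reads like a protocol description rather than a list of API 
calls.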

This doesn’t look like an API any more. This looks like a protocol. 
We’ve got experience in describing protocols - but it becomes much 
easier to apply that experience when we call it a protocol.

Let’s do that.

But - you did not address my point...
No, I didn’t. But here are some points that might bear watching.

“We shouldn’t mandate a new protocol on the wire”.
We’re not. We’re specifying one, and mandating it for a specific point: 
The browser/JS interface.
We can have applications that parse it locally using a JS library; in 
that case, the protocol runs between the browser and the local JS 
library.
We can have applications that pass the blobs to a server for gatewaying 
into something else; in that case, the protocol runs between the browser 
and the server.
We can have applications that pass the blobs (possibly via a very simple 
server) to another browser, unchanged; in that case, the protocol runs 
between the two browsers.

In no case does the browser need to know where the other end is; all it 
cares about is that the messages flow according to protocol.
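
The three cases can be sketched as follows. This is TypeScript with 
hypothetical names throughout (Peer, browserSend and the three peer 
objects are invented for the example); the return strings exist only so 
the example can show which peer handled the message.

```typescript
// Hypothetical sketch: the interface and the three peers are illustrative only.

// From the browser's side, a peer is just something that accepts messages.
interface Peer {
  deliver(blob: string): string; // returns a label describing what happened
}

// Case 1: a local JS library that parses the blob itself.
const localLibrary: Peer = {
  deliver: (blob) =>
    "parsed locally: " + Object.keys(JSON.parse(blob)).join(","),
};

// Case 2: a server that gateways the blob into something else.
const gateway: Peer = {
  deliver: (blob) => "translated by gateway (" + blob.length + " chars in)",
};

// Case 3: a very simple server that relays to another browser, unchanged.
const relayToBrowser: Peer = {
  deliver: (blob) => "relayed unchanged: " + blob,
};

// The browser-side code is identical in all three cases.
function browserSend(peer: Peer, blob: string): string {
  return peer.deliver(blob);
}

const blob = JSON.stringify({ codecs: ["codecA"] });
const results = [localLibrary, gateway, relayToBrowser].map((p) =>
  browserSend(p, blob)
);
results.forEach((r) => console.log(r));
```

The browser-side function never inspects which kind of peer it was 
given.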

“We should build knobs and let JS libraries take care of what’s on top 
of them”.
Well - we could. But it violates the idea of only having the domain 
knowledge present once.
If you have to have a browser that implements codec A, and a JS library 
that knows how to control codec A, you are *requiring* knowledge in two 
places. That’s not optimal.
Things may change in the future - once downloadable codecs actually 
work, downloadable negotiation blobs may be a reasonable counterpart. 
But we’re not there now.

“We should use existing protocol X”
We could. But then, we’d buy into all the baggage of protocol X - both 
the things it requires us to do and the things that it won’t let us do. 
Doesn’t sound too good to me.