Re: [rtcweb] Call for Consensus on Use Case for Screen/Application/Desktop sharing

Randell Jesup <> Tue, 20 September 2011 15:24 UTC

On 9/20/2011 8:12 AM, Emil Ivov wrote:
> On 19.09.11 09:50, Olle E. Johansson wrote:
>>>> B) Where a remote peer can provide one or more input types such
>>>> as mouse and keyboard to control the local system, not only
>>>> including the browser, but also other operating system resources.
>>>> This clearly can only happen after additional consent, most
>>>> likely on a per occasion consent.
>>> I see this as a more tricky thing to get right (in most apps, the
>>> mixing of events from multiple sources depends strongly on both
>>> proper timing/sequencing and reliable delivery). I would like to
>>> not address this for now (RTCWeb version 1).
>> I think it's a good use case for the data channel. How many such use
>> cases do we have? While use case A is quite often handled as a normal
>> video stream, use case B is likely something like VNC. This is an
>> application that is part of Microsoft Lync as well as the free SIP
>> client Blink today.
> I don't see both as having separate implementations. We could very well
> have browsers stream the desktop as a regular video flow in one
> direction and then add to that (or not) user feedback in the opposite
> direction. The feedback could go over signalling (which is what we are
> doing in Jitsi), over RTP (something like RFC 4733's DTMF) or over
> Pseudo TCP.
> Either way I think it's something quite important and I don't think
> there's a good reason to leave it for later.
> I would be happy to work on that.
Great!  There is one fundamental question to answer about this use-case:
How does it work in the existing JS security model?

For B) to work, it not only has to accept mouse/keyboard/etc input from
the other side, but it must take that and inject it into the local browser or
operating system as user input.

The only way I can see to get that to work without a *massive* breach in the
JS security model would be to make it something JS can't touch directly, but
that is instead routed (with user permission) to the playback mechanism, the
way codec output is. And even then, the permission from the user would have
to be very carefully tied to the source.
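
To make "permission tied to the source" a bit more concrete, here is a minimal sketch of how a browser-internal gate might look. Everything in it is an assumption for illustration: `SourceId`, `InputGrantStore`, and `routeRemoteInput` are hypothetical names, not part of any RTCWeb draft or browser API. The point is only that the input payload goes from the transport to the injector without passing through page JS, and that a grant for one source says nothing about any other.

```typescript
// Hypothetical browser-internal sketch; none of these names exist in a real API.
interface SourceId {
  origin: string;          // origin of the web app that set up the session
  peerFingerprint: string; // identity of the remote peer sending input
}

class InputGrantStore {
  private grants = new Set<string>();

  // Both parts of the source go into the key, so a grant is never shared
  // between two peers of the same app or two apps talking to the same peer.
  private static key(s: SourceId): string {
    return JSON.stringify([s.origin, s.peerFingerprint]);
  }

  grant(s: SourceId): void {
    this.grants.add(InputGrantStore.key(s));
  }

  revoke(s: SourceId): void {
    this.grants.delete(InputGrantStore.key(s));
  }

  isGranted(s: SourceId): boolean {
    return this.grants.has(InputGrantStore.key(s));
  }
}

// Routes an opaque input payload to the injector only if the user has granted
// this exact source. Returns true if the payload was injected, false if dropped.
function routeRemoteInput(
  store: InputGrantStore,
  source: SourceId,
  payload: Uint8Array,
  inject: (payload: Uint8Array) => void
): boolean {
  if (!store.isGranted(source)) {
    return false;
  }
  inject(payload);
  return true;
}
```

In this sketch the grant would be created by browser UI after the user consents, and revoked when the session ends, which matches the per-occasion consent suggested in case B.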

This would mean that the format for all the data on the wire would need to
be specified and standardized; it can't easily be an application-specific
protocol. (Ok, I can see one painful, indirect way to avoid our building a
remote desktop wire protocol, which is to give a way for the JS app to provide
the message format in a non-executable manner that we could use to transform
the wire data into user-input data.  But that's really no win for us.)  Or
one has to standardize it as a "codec", but that's complicated by not wanting
to deliver it over an unreliable channel.
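
As a rough illustration of the "non-executable message format" idea above, the sketch below has the app hand over a purely declarative description of its wire layout, which the browser then uses to turn wire bytes into input data. All names (`FieldSpec`, `FormatDescriptor`, `decodeInput`, `pointerFormat`) and the big-endian fixed-offset layout are assumptions made for the example, not a proposed format.

```typescript
// Hypothetical sketch: the app supplies data only, never code.
type FieldType = "u8" | "u16" | "i16";
type FieldName = "kind" | "x" | "y" | "keyCode";

interface FieldSpec {
  name: FieldName;
  type: FieldType;
  offset: number; // byte offset of the field within one message
}

type FormatDescriptor = FieldSpec[];
type RemoteInputEvent = Partial<Record<FieldName, number>>;

const FIELD_SIZE: Record<FieldType, number> = { u8: 1, u16: 2, i16: 2 };

// Browser-side decoder: reads only the fields the descriptor names, and
// rejects any descriptor entry that points outside the message.
function decodeInput(format: FormatDescriptor, wire: Uint8Array): RemoteInputEvent {
  const view = new DataView(wire.buffer, wire.byteOffset, wire.byteLength);
  const event: RemoteInputEvent = {};
  for (const field of format) {
    const end = field.offset + FIELD_SIZE[field.type];
    if (!Number.isInteger(field.offset) || field.offset < 0 || end > wire.byteLength) {
      throw new RangeError("field " + field.name + " is outside the message");
    }
    switch (field.type) {
      case "u8":
        event[field.name] = view.getUint8(field.offset);
        break;
      case "u16":
        event[field.name] = view.getUint16(field.offset, false); // big-endian
        break;
      case "i16":
        event[field.name] = view.getInt16(field.offset, false); // big-endian
        break;
    }
  }
  return event;
}

// Example descriptor an app might register for pointer messages.
const pointerFormat: FormatDescriptor = [
  { name: "kind", type: "u8", offset: 0 },
  { name: "x", type: "u16", offset: 1 },
  { name: "y", type: "u16", offset: 3 },
];
```

Even in this toy form, the browser still has to define the descriptor language and the resulting event model, which is why this route saves little over standardizing the wire protocol directly.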

The alternative would be to have the JS app be (VERY) trusted, which so far
is not the threat model we're using.  Perhaps once again this ties into whatever
security model will be used for user-installed web-apps, since those in many
cases will be taking the place of trusted binary installed apps.

So, looking into this would be great, but I suspect we can't put it into
the first spec.  It could be a follow-on, perhaps under the webapp security
model.  (My apologies; I know that there's lots of work going on in that
space, but I haven't been following it closely.)

While I'd love to see both A and B in WebRTC 1.0, I just don't see B as being
feasible in this timeframe.  Feel free to show I'm wrong, though!

Randell Jesup