Re: [rtcweb] PPID, UTF-16 and DOMString (Re: RTCWEB Data Channel: Usage of PPID for protocol multiplexing)

Harald Alvestrand <harald@alvestrand.no> Mon, 10 February 2014 04:57 UTC

Message-ID: <52F85C14.7010904@alvestrand.no>
Date: Mon, 10 Feb 2014 05:56:52 +0100
From: Harald Alvestrand <harald@alvestrand.no>
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.2.0
MIME-Version: 1.0
To: rtcweb@ietf.org
References: <7594FB04B1934943A5C02806D1A2204B1D15E955@ESESSMB209.ericsson.se> <74072016-DA11-41E8-9944-779428163EE4@lurchi.franken.de> <7594FB04B1934943A5C02806D1A2204B1D15ED94@ESESSMB209.ericsson.se> <E1FE4C082A89A246A11D7F32A95A17826DFCF6C8@US70UWXCHMBA02.zam.alcatel-lucent.com> <B304F67A-9EA2-44A0-86DD-9DD0E21CB86F@lurchi.franken.de> <52F4182F.60404@alvestrand.no> <E1FE4C082A89A246A11D7F32A95A17826DFD40C3@US70UWXCHMBA02.zam.alcatel-lucent.com>, <CABkgnnXMYt7teSpxm7chTvQPR8ThKzKQ_bq3Po_FNFv2tdBFGQ@mail.gmail.com> <7594FB04B1934943A5C02806D1A2204B1D163B26@ESESSMB209.ericsson.se> <E1FE4C082A89A246A11D7F32A95A17826DFD4581@US70UWXCHMBA02.zam.alcatel-lucent.com>
In-Reply-To: <E1FE4C082A89A246A11D7F32A95A17826DFD4581@US70UWXCHMBA02.zam.alcatel-lucent.com>

On 02/08/2014 02:14 AM, Makaraju, Maridi Raju (Raju) wrote:
> That's good. Glad to see that no one liked the complex alternate option... yet! :-)
> I just put it there for completeness and some discussion.
> With "UTF-8 on the wire", which I too think is the right approach, there is some spec work to do:
> 1. Define an IANA UTF-8 PPID. Creating a new one is better than replacing the existing DOMString PPID, since DOMString may be used by browsers for some time to come, and a new PPID preserves backward compatibility and interworking.

Have you ever seen UTF-16 on the wire in the DataChannel protocol?

> 2. The data channel core spec (http://tools.ietf.org/html/draft-ietf-rtcweb-data-channel-06#section-6.6) needs to specify explicitly that the wire format is UTF-8.

Fully agreed.

>
> Then browsers need to convert from UTF-16 DOMString to UTF-8 before writing to the wire, and from UTF-8 back to DOMString in the reverse direction.
>
> -Raju
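
To make the conversion step concrete, here is a rough sketch of what the
browser would do under send() and onmessage. The function names are
hypothetical and not taken from any draft or implementation. Replacing a
lone surrogate with U+FFFD is an assumption on my part, and the decoder
assumes well-formed UTF-8 input.

```typescript
// Hypothetical helpers for illustration only; not from any spec or browser.

// Encode a JavaScript (UTF-16) string as UTF-8 bytes for the wire.
function utf16ToUtf8(s: string): Uint8Array {
  const out: number[] = [];
  for (let i = 0; i < s.length; i++) {
    let cp = s.charCodeAt(i);
    if (cp >= 0xd800 && cp <= 0xdbff) {
      // High surrogate: combine with the following low surrogate, if any.
      const lo = i + 1 < s.length ? s.charCodeAt(i + 1) : 0;
      if (lo >= 0xdc00 && lo <= 0xdfff) {
        cp = 0x10000 + ((cp - 0xd800) << 10) + (lo - 0xdc00);
        i++;
      } else {
        cp = 0xfffd; // assumption: lone high surrogate becomes U+FFFD
      }
    } else if (cp >= 0xdc00 && cp <= 0xdfff) {
      cp = 0xfffd; // assumption: lone low surrogate becomes U+FFFD
    }
    if (cp < 0x80) {
      out.push(cp);
    } else if (cp < 0x800) {
      out.push(0xc0 | (cp >> 6), 0x80 | (cp & 0x3f));
    } else if (cp < 0x10000) {
      out.push(0xe0 | (cp >> 12), 0x80 | ((cp >> 6) & 0x3f), 0x80 | (cp & 0x3f));
    } else {
      out.push(
        0xf0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3f),
        0x80 | ((cp >> 6) & 0x3f),
        0x80 | (cp & 0x3f)
      );
    }
  }
  return new Uint8Array(out);
}

// Decode UTF-8 bytes from the wire back into a JavaScript (UTF-16) string.
// Assumes the input is well-formed; a real decoder must validate it.
function utf8ToUtf16(bytes: Uint8Array): string {
  let s = "";
  let i = 0;
  while (i < bytes.length) {
    const b = bytes[i];
    let cp: number;
    let extra: number; // number of continuation bytes that follow
    if (b < 0x80) { cp = b; extra = 0; }
    else if ((b & 0xe0) === 0xc0) { cp = b & 0x1f; extra = 1; }
    else if ((b & 0xf0) === 0xe0) { cp = b & 0x0f; extra = 2; }
    else { cp = b & 0x07; extra = 3; }
    for (let k = 1; k <= extra; k++) {
      cp = (cp << 6) | (bytes[i + k] & 0x3f);
    }
    i += extra + 1;
    if (cp >= 0x10000) {
      // Outside the BMP: emit a surrogate pair.
      cp -= 0x10000;
      s += String.fromCharCode(0xd800 + (cp >> 10), 0xdc00 + (cp & 0x3ff));
    } else {
      s += String.fromCharCode(cp);
    }
  }
  return s;
}

// Two UTF-16 code units become five UTF-8 bytes: c3 a9 e2 82 ac.
const demoWire = utf16ToUtf8("é€");
const demoBack = utf8ToUtf16(demoWire);
```

The script never sees the byte representation; it passes and receives
ordinary strings, and a native client on the other end reads plain UTF-8.
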
>
>> -----Original Message-----
>> From: Christer Holmberg [mailto:christer.holmberg@ericsson.com]
>> Sent: Friday, February 07, 2014 6:01 PM
>> To: Martin Thomson; Makaraju, Maridi Raju (Raju)
>> Cc: rtcweb@ietf.org
>> Subject: RE: [rtcweb] PPID, UTF-16 and DOMString (Re: RTCWEB Data Channel:
>> Usage of PPID for protocol multiplexing)
>>
>>
>> + 1
>>
>> Regards,
>>
>> Christer
>>
>> ________________________________________
>> From: rtcweb [rtcweb-bounces@ietf.org] on behalf of Martin Thomson
>> [martin.thomson@gmail.com]
>> Sent: Saturday, 08 February 2014 1:45 AM
>> To: Makaraju, Maridi Raju (Raju)
>> Cc: rtcweb@ietf.org
>> Subject: Re: [rtcweb] PPID, UTF-16 and DOMString (Re: RTCWEB Data Channel:
>> Usage of PPID for protocol multiplexing)
>>
>> I don't think that we need any complication here.  It's a string.
>>
>> Strings are UTF-8 on the wire.
>>
>> Strings are UTF-16 (mostly) in JavaScript.
>>
>> Anything else would generate pain.
>>
>> On 7 February 2014 14:01, Makaraju, Maridi Raju (Raju)
>> <Raju.Makaraju@alcatel-lucent.com> wrote:
>>>>>> Also, I think "DOMString" PPID is too specific to Javascript, instead it
>>>>>> should probably have a generic name like "UTF-16 String". The send API can
>>>>>> still use DOMString as this is Javascript API.
>>>>> Any comments from others?
>>>> Note: The WebSockets protocol defines the transferred strings as UTF-8.
>>>> http://tools.ietf.org/html/rfc6455#section-5.6
>>>>
>>>> As far as I can tell, we've always intended to follow that example.
>>>>
>>>> The fact that Javascript implementations currently choose to represent
>>>> text strings as UTF-16 at their API is sad, but not an argument for
>>>> sending that particular text representation over the wire, or reflecting
>>>> the name in the API.
>>> [Raju] I agree that using UTF-8 is desired and more appropriate! Then,
>>> should the PPID be changed from "DOMString" to "UTF-8"? JavaScript-based
>>> apps would have to use some library to convert DOMString/UTF-16 to UTF-8.
>>> Alternatively, browsers can do this conversion under the APIs (send and
>>> onmessage), before sending and after receiving UTF-8 PPID data.
>>> Without such conversion, WebRTC interworking between browsers and native
>>> clients will be problematic (basically, it will not work).
>>> Another option, which is more flexible, is to define PPIDs for different
>>> encodings like "UTF-8" and "UTF-16" (or even "base64" for binary-to-text,
>>> instead of using "binary" directly), and then pass this encoding
>>> information to the send() and onmessage() calls, which will use these
>>> PPIDs. Passing the encoding information might be implicit in most
>>> languages (as with JavaScript's DOMString), given by the type of the
>>> argument to send. This way it is up to the application to deal with the
>>> encoding conversions as it sees fit.
>>> The latter approach is best suited to interworking between clients of
>>> similar type (web or native), but it is a bit painful (because of the
>>> conversions) for clients of different types.
>>> I am wondering what kind of API calls native WebRTC API stacks (e.g.
>>> Google Chrome's http://www.webrtc.org/webrtc-native-code-package) provide
>>> for data channel send/onmessage(): UTF-16 strings, or UTF-8?
>>> -Raju
>>>
>>> _______________________________________________
>>> rtcweb mailing list
>>> rtcweb@ietf.org
>>> https://www.ietf.org/mailman/listinfo/rtcweb


-- 
Surveillance is pervasive. Go Dark.