Re: [hybi] thewebsocketprotocol #40 (new): Clarify binary/utf-8 mixed handling
Gabriel Montenegro <Gabriel.Montenegro@microsoft.com> Thu, 10 February 2011 09:47 UTC
From: Gabriel Montenegro <Gabriel.Montenegro@microsoft.com>
To: Bjoern Hoehrmann <derhoermi@gmx.net>, "g_e_montenegro@yahoo.com" <g_e_montenegro@yahoo.com>
Date: Thu, 10 Feb 2011 09:47:22 +0000
Cc: "hybi@ietf.org" <hybi@ietf.org>

Yes, if one accepts UTF-8, that must be validated, no question about it.
The point is not to disallow UTF-8, but to be able to optimize on those
websocket connections that use binary only, so this is not a fragmenting
of the protocol any more than any other per-session negotiable feature is.

From the comments, folks seem to be thinking about the Javascript
Websockets API, and how it neither allows binary nor supports streaming.
That's true, but as we've emphasized before: we're designing the
websockets protocol to be usable with the JS Websockets API, but it is
not limited to it. Other APIs may offer both binary and streaming.

> -----Original Message-----
> From: hybi-bounces@ietf.org [mailto:hybi-bounces@ietf.org] On Behalf Of
> Bjoern Hoehrmann
> Sent: Wednesday, February 09, 2011 17:00
> To: g_e_montenegro@yahoo.com
> Cc: hybi@ietf.org
> Subject: Re: [hybi] thewebsocketprotocol #40 (new): Clarify binary/utf-8
> mixed handling
>
> * hybi issue tracker wrote:
> > Additionally, when only partial frames may be available, it is
> > expensive to verify that this is indeed a valid UTF-8 stream (protocol
> > implementation needs to take into account multi-byte characters and
> > end of current data payload). If binary has been negotiated for this
> > session, processing can be optimized accordingly.
>
> If you do accept UTF-8 encoded data then you have to validate it,
> otherwise you get strange and possibly dangerous failures if you receive
> malformed data, for instance, you can't trust that the length has been
> calculated correctly. Anyway, you would seem to have this problem due to
> fragmentation anyway if you accept text frames, and if you don't mean to
> accept text frames then you just don't, there would seem to be no need to
> negotiate for "binary-only". I also note that validating UTF-8 is not
> really expensive, it's just a matter of `state = table[state + byte]` for
> each byte. Work, sure, but not very "expensive". It's easy too if you use
> my http://bjoern.hoehrmann.de/utf-8/decoder/dfa/ decoder for it.
>
> > Proposal: allow negotiation to clarify if a stream will not mix binary
> > and text in order to enable optimizing for the binary-only case.
>
> I do agree the protocol specification needs to discuss mixing frame
> types, like, what if you have a fragmented text message but one of the
> frames is not a text frame, but allowing to negotiate this complexity
> away will most likely lead to interoperability and security problems, as
> people will take shortcuts like not validating the frame type. We've in
> fact seen that already on this list. Essentially negotiation binary-only
> would be subsetting and fragmenting the protocol, and I don't think the
> benefit here warrants that.
> --
> Björn Höhrmann · mailto:bjoern@hoehrmann.de · http://bjoern.hoehrmann.de
> Am Badedeich 7 · Telefon: +49(0)160/4415681 · http://www.bjoernsworld.de
> 25899 Dagebüll · PGP Pub. KeyID: 0xA4357E78 · http://www.websitedev.de/
> _______________________________________________
> hybi mailing list
> hybi@ietf.org
> https://www.ietf.org/mailman/listinfo/hybi
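
To make the partial-frame concern concrete, here is a minimal sketch of
validating UTF-8 incrementally across fragment boundaries. It is not the
table-driven DFA decoder linked in the quoted message; it is a hand-written
state machine that follows the same idea of carrying a small amount of state
from one byte to the next, so a multi-byte character split between two
fragments needs no buffering. The names Utf8Validator and validate_fragments
are hypothetical and exist only for this illustration.

```python
from typing import Iterable


class Utf8Validator:
    """Incremental UTF-8 validator (illustrative sketch, not a library API).

    State kept between calls: how many continuation bytes are still
    expected, and the allowed range for the next continuation byte.
    """

    def __init__(self) -> None:
        self.need = 0                 # continuation bytes still expected
        self.lo, self.hi = 0x80, 0xBF  # allowed range for the next one

    def feed(self, data: bytes) -> bool:
        """Consume one fragment; return False on the first invalid byte."""
        for b in data:
            if self.need == 0:
                if b <= 0x7F:
                    continue                      # ASCII
                elif 0xC2 <= b <= 0xDF:
                    self.need = 1                 # 2-byte sequence
                elif b == 0xE0:
                    self.need, self.lo = 2, 0xA0  # reject overlong forms
                elif 0xE1 <= b <= 0xEC or 0xEE <= b <= 0xEF:
                    self.need = 2
                elif b == 0xED:
                    self.need, self.hi = 2, 0x9F  # reject surrogates
                elif b == 0xF0:
                    self.need, self.lo = 3, 0x90  # reject overlong forms
                elif 0xF1 <= b <= 0xF3:
                    self.need = 3
                elif b == 0xF4:
                    self.need, self.hi = 3, 0x8F  # reject > U+10FFFF
                else:
                    return False                  # invalid lead byte
            else:
                if not (self.lo <= b <= self.hi):
                    return False
                self.need -= 1
                self.lo, self.hi = 0x80, 0xBF     # later bytes: full range
        return True

    def finish(self) -> bool:
        """True only if the message did not end in the middle of a character."""
        return self.need == 0


def validate_fragments(fragments: Iterable[bytes]) -> bool:
    """Validate a text message delivered as a sequence of byte fragments."""
    validator = Utf8Validator()
    for fragment in fragments:
        if not validator.feed(fragment):
            return False
    return validator.finish()
```

For example, validate_fragments([b"caf\xc3", b"\xa9"]) accepts a message in
which the two bytes of one character arrive in different fragments, while a
message that ends after b"\xe2\x82" is rejected by finish() because a
character is left incomplete.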