Re: [hybi] Review of draft-ietf-hybi-thewebsocketprotocol-13

Greg Wilkins <gregw@intalio.com> Wed, 07 September 2011 01:23 UTC

In-Reply-To: <72E40A0F-C923-472F-9534-538B89F7A444@bbn.com>
References: <942CCA6B-B784-441B-96CA-3506FFC439E1@bbn.com> <CALiegfmyQ5h4S2FgBnrh2VLr8+q-h0sLiGsww7T+1VwYNRo4wQ@mail.gmail.com> <72E40A0F-C923-472F-9534-538B89F7A444@bbn.com>
Date: Wed, 07 Sep 2011 11:25:17 +1000
Message-ID: <CAH_y2NF2PmeYa_KBFQ3qRw5_1Ywa0AB_ar71N36Os_wHk534fg@mail.gmail.com>
From: Greg Wilkins <gregw@intalio.com>
To: "Richard L. Barnes" <rbarnes@bbn.com>
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: quoted-printable
Cc: General Area Review Team <gen-art@ietf.org>, hybi@ietf.org
Subject: Re: [hybi] Review of draft-ietf-hybi-thewebsocketprotocol-13

On 7 September 2011 00:43, Richard L. Barnes <rbarnes@bbn.com> wrote:
>>> Section 5.6, "Note that a particular text frame might include a partial UTF-8 sequence, however the whole message MUST contain valid UTF-8"
>>> This requirement is meaningless, since the concept of a "message" is not defined here.  Suggest going back to a requirement that a frame MUST contain valid UTF-8 (i.e., that it breaks at code-point boundaries).
>>
>> No please. This has been already discussed.
>>
>> Imagine I must send a very big WS UTF-8 message and, due to max frame
>> size requirements (it is still unclear how such a requirement is
>> "negotiated"), I need to split it into N frames. This feature would work
>> at the very transport core layer.
>>
>> Probably I have a function that splits the whole WS message into
>> chunks of N bytes (I mean "bytes" because I do know the max frame size
>> in *bytes*), so such function just counts N bytes from the WS message
>> and generates a frame. Please don't force such function to be
>> Unicode/UTF-8 aware, no please.
>
> Clearly it already has to be WebSocket aware, and it already has to read the opcode in order to distinguish data frames from control frames.  Adding on a requirement to break at code point boundaries does not seem hugely onerous.  It's three lines of C:

It is more difficult than that.

For example, I currently fragment frames when a buffer fills up.
While filling it I'm not looking at the bytes and don't care whether it
is a text or binary frame. If I have to fragment on a UTF-8 character
boundary, then I'll have to a) handle text and binary differently, b)
actually inspect the bytes rather than just bulk copy them, and c) deal
with residue bytes left over that could not be put into the fragment.

This would be a throw-it-all-out-and-start-again kind of change for me.
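
As a rough illustration of points a) to c) above, here is a minimal sketch in C of the check a sender would need before closing a text fragment. It is not taken from any implementation discussed in this thread: the names utf8_seq_len and utf8_safe_split are hypothetical, and the sketch assumes the buffer holds otherwise valid UTF-8 whose final sequence may have been cut short when the buffer filled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: total length of the UTF-8 sequence that starts
 * with this lead byte, or 0 if the byte is not a valid lead byte. */
static size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80) return 1;            /* 0xxxxxxx: ASCII        */
    if ((lead & 0xE0) == 0xC0) return 2;  /* 110xxxxx: 2-byte lead  */
    if ((lead & 0xF0) == 0xE0) return 3;  /* 1110xxxx: 3-byte lead  */
    if ((lead & 0xF8) == 0xF0) return 4;  /* 11110xxx: 4-byte lead  */
    return 0;
}

/* Hypothetical helper: return how many bytes of buf[0..len) can go into
 * a text fragment so that the fragment ends on a code point boundary.
 * The remaining len - result bytes (at most 3) are the residue that has
 * to be carried over into the next fragment. */
size_t utf8_safe_split(const unsigned char *buf, size_t len)
{
    size_t back;

    assert(buf != NULL || len == 0);

    /* Walk backwards over at most 4 bytes looking for the last lead byte. */
    for (back = 1; back <= 4 && back <= len; back++) {
        unsigned char b = buf[len - back];

        if ((b & 0xC0) == 0x80)
            continue;                     /* continuation byte, keep going */

        if (utf8_seq_len(b) > back)
            return len - back;            /* last sequence is truncated */
        return len;                       /* last sequence is complete  */
    }

    /* No lead byte near the end: leave it to UTF-8 validation elsewhere. */
    return len;
}
```

A sender using something like this would call it only for text frames (point a), would have to look at the tail of every buffer instead of bulk copying it (point b), and would have to hold back the bytes after the returned offset and prepend them to the next fragment (point c).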