Re: [hybi] Fragmented text message

David Endicott <dendicott@gmail.com> Thu, 21 July 2011 16:11 UTC


Would that not then require the WebSocket peer and any WebSocket-aware
intermediaries to examine and understand the content of the frames (i.e.,
to be able to tell whether a UTF-8 sequence is being split)?

That seems to me highly impractical, or even impossible.

Also, I consider WS to be a transport protocol, and as such it must be
content agnostic.

So long as the endpoints deliver a valid UTF-8 stream via their APIs, what
does it matter if it's scattered on the wire?
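
To make that concrete, here is a minimal sketch in Python, assuming the
endpoint simply feeds each frame payload to an incremental decoder (the
standard library's codecs.getincrementaldecoder). The helper name
decode_fragments and the sample payloads are made up for illustration; this
is not taken from any particular implementation.

```python
import codecs


def decode_fragments(fragments):
    """Decode an iterable of frame payloads (bytes) into pieces of text.

    Hypothetical helper: after each fragment it yields whatever complete
    code points are available, holding back any trailing partial sequence
    until the next fragment arrives.
    """
    decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")
    for payload in fragments:
        yield decoder.decode(payload)
    # final=True raises UnicodeDecodeError if the message ended mid-sequence.
    yield decoder.decode(b"", final=True)


# "h\u00e9llo \u20ac" split in the middle of both multi-byte characters.
message = "h\u00e9llo \u20ac".encode("utf-8")
fragments = [message[:2], message[2:8], message[8:]]
pieces = list(decode_fragments(fragments))
text = "".join(pieces)
```

The wire-level split is invisible to the application here: it only ever sees
whole code points, and a message that ends inside a sequence is rejected at
the end rather than per frame.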


On Thu, Jul 21, 2011 at 12:01 PM, Piotr Kulaga <piotrku@microsoft.com> wrote:

> My understanding is that each WebSocket entity that uses UTF-8 must have
> a valid UTF-8 stream as a payload. This covers all UTF-8 frames, including
> continuation frames.
>
> I slightly prefer the approach where the UTF-8 message as a whole must
> contain a valid sequence, rather than each continuation frame (simpler for
> intermediaries; endpoints that stream data to the application must still
> handle the partial UTF-8 code point encoding case). I am fine with either
> approach as long as it is well defined.
>
> *From:* hybi-bounces@ietf.org [mailto:hybi-bounces@ietf.org] *On Behalf Of
> *Takeshi Yoshino
> *Sent:* Thursday, July 21, 2011 2:22 AM
> *To:* Brian
> *Cc:* hybi@ietf.org
> *Subject:* Re: [hybi] Fragmented text message
>
> Agreed. Nothing in the spec disallows splitting a UTF-8 byte sequence
> across separate frames.
>
> Impatient receivers must use an intelligent UTF-8 decoder, as Brian
> explained, to get code points decoded. For example, in Python,
> codecs.getincrementaldecoder('utf-8')() does this. Most major platforms
> probably have a similar library.
>
> We could also add a constraint that a text message must be fragmented at
> a UTF-8 byte sequence boundary, but that complicates fragmentation code.
> I'm not in favor of that.
>
> On Thu, Jul 21, 2011 at 17:47, Brian <theturtle32@gmail.com> wrote:
>
> Unless I'm mistaken, fragmentation may occur in the middle of a
> multi-byte character sequence. Your code should be aware of that when
> decoding. My initial implementation buffers all fragments and then decodes
> the whole message into a string at once. I imagine you could inspect the
> last four bytes of a fragment to determine whether there's a partial UTF-8
> character. If there is, you could buffer just those few bytes and decode
> the rest of the fragment. Then, when the next fragment comes in, prepend
> those bytes to the new payload and continue. Depending on your use case
> and what you're optimizing for, it may be more efficient to just buffer
> the whole message and then decode.
>
> Brian
>
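
For reference, the tail-buffering idea Brian describes above could be
sketched roughly as follows in Python. The function names are hypothetical,
and this assumes each payload arrives as a bytes object; it leaves full
validation (overlong forms, surrogates and so on) to the built-in strict
UTF-8 decoder.

```python
def split_incomplete_tail(data):
    """Split data into (complete, tail), where tail is a trailing UTF-8
    sequence that has started but not finished. tail is b"" if none."""
    # Look back over at most the last four bytes for a lead byte.
    for i in range(1, min(4, len(data)) + 1):
        b = data[-i]
        if b & 0xC0 == 0x80:
            continue  # continuation byte, keep looking for the lead byte
        if b < 0x80:
            expected = 1  # ASCII
        elif b >> 5 == 0b110:
            expected = 2
        elif b >> 4 == 0b1110:
            expected = 3
        elif b >> 3 == 0b11110:
            expected = 4
        else:
            return data, b""  # invalid lead byte, let the decoder report it
        if expected > i:
            return data[:-i], data[-i:]  # sequence is cut off, hold it back
        return data, b""
    return data, b""  # empty input or only continuation bytes


def decode_fragments_manually(fragments):
    """Decode fragments one at a time, carrying partial sequences over."""
    pending = b""
    out = []
    for payload in fragments:
        complete, pending = split_incomplete_tail(pending + payload)
        out.append(complete.decode("utf-8"))  # strict: raises on bad bytes
    if pending:
        raise ValueError("message ended inside a UTF-8 sequence")
    return "".join(out)
```

At most three bytes are ever carried over between fragments, so the memory
cost is constant regardless of message size.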
> On Thu, Jul 21, 2011 at 1:28 AM, Kukosa, Tomas <
> tomas.kukosa@siemens-enterprise.com> wrote:
>
> If a text message is fragmented, must each fragment be a valid UTF-8
> string, or must only the complete defragmented message be a valid UTF-8
> string? I.e., may I decode each fragment as UTF-8 while receiving and then
> join the strings, or do I need to receive all fragments and then decode
> only the defragmented message?