Re: [hybi] Review of draft-ietf-hybi-thewebsocketprotocol-13

Tobias Oberstein <tobias.oberstein@tavendo.de> Tue, 06 September 2011 16:02 UTC

From: Tobias Oberstein <tobias.oberstein@tavendo.de>
To: "Richard L. Barnes" <rbarnes@bbn.com>
Date: Tue, 06 Sep 2011 09:03:19 -0700
Cc: General Area Review Team <gen-art@ietf.org>, "hybi@ietf.org" <hybi@ietf.org>
Subject: Re: [hybi] Review of draft-ietf-hybi-thewebsocketprotocol-13

> > When a frame does not end on code point boundary, one needs to
> > remember at most 3 bytes to continue validation on next frame.
> 
> If frames are valid utf-8, then you don't need to keep any state (on either
> end of the connection).

You do need to keep state: at least the opcode of the first frame of the
message, since it determines the type of the whole message, and the
continuation state. Without the latter you cannot detect a protocol
violation when you receive a continuation frame while there is nothing to
continue, or a frame with FIN=1 but opcode != 0 while a continuation is
expected.

That's 4 bits, but you need state.
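
To make that concrete, here is a rough sketch of the per-connection state
I mean. The names (FragmentationTracker, ProtocolViolation, on_data_frame)
are made up for illustration, and it assumes control frames are handled
elsewhere and never passed in:

```python
class ProtocolViolation(Exception):
    """Hypothetical error type: the connection should be failed."""


class FragmentationTracker:
    """Remembers only the opcode of the message currently in progress."""

    CONTINUATION = 0x0  # continuation opcode, as in the draft

    def __init__(self):
        # None means no fragmented message is open.
        self.message_opcode = None

    def on_data_frame(self, fin, opcode):
        """Return the message type this frame belongs to, or raise."""
        if opcode == self.CONTINUATION:
            if self.message_opcode is None:
                raise ProtocolViolation("continuation with nothing to continue")
            message_type = self.message_opcode
        else:
            if self.message_opcode is not None:
                raise ProtocolViolation("new data frame, continuation expected")
            message_type = opcode
            self.message_opcode = opcode
        if fin:
            # Final frame of the message: forget the stored opcode.
            self.message_opcode = None
        return message_type
```

Feeding it (fin=False, opcode=1), then (fin=False, opcode=0), then
(fin=True, opcode=0) yields message type 1 three times; a further
continuation frame after that raises.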

You also need state even for unfragmented messages whenever the
frame size exceeds the amount that can be buffered (and a streaming API
is in place).

> > It would make sense that a peer SHOULD fail a connection upon invalid
> > UTF-8 as soon as it is possible - that means with at most 1 frame
> > delay upon the start of the byte sequence that was invalid UTF-8.
> >
> > Anyway: what's the advantage of such an requirement?
> 
> The advantage is frame-wise validation instead of message-wise validation.
> As you point out, it's not a huge distinction, more "be conservative in what
> you send".  It just seems unnecessarily sloppy not to have frame boundaries
> coincide with code point boundaries.

How do you validate a frame of 2^63 octets?

Neither message-based nor frame-based validation helps here. One needs
incremental (streaming) validation.

And if you have incremental validation, there is no need to have frame boundaries
on whole code points. The incremental validator just keeps its internal state
(of at most 3 bytes) and you feed it the next chunk of octets until it bails with
"invalid UTF-8", upon which the connection is immediately failed.
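
As an illustration only, such a validator can be sketched on top of
Python's standard incremental UTF-8 decoder, which buffers an incomplete
trailing sequence between calls. The wrapper name Utf8StreamValidator and
its feed() method are made up for this example; a real implementation
would likely validate without decoding:

```python
import codecs


class Utf8StreamValidator:
    """Validates UTF-8 across arbitrary chunk boundaries."""

    def __init__(self):
        # Strict mode: any invalid byte sequence raises UnicodeDecodeError.
        self._decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")

    def feed(self, chunk, final=False):
        """Return False as soon as the stream is known to be invalid.

        Pass final=True with the last chunk of a message so that a
        truncated trailing sequence is also reported as invalid.
        """
        try:
            self._decoder.decode(chunk, final)
        except UnicodeDecodeError:
            return False
        return True


validator = Utf8StreamValidator()
# The euro sign (E2 82 AC) split across two chunks is still accepted.
split_ok = validator.feed(b"\xe2\x82") and validator.feed(b"\xac", final=True)
```

The chunks can be frame payloads or pieces of one huge frame; the
validator does not care where the boundaries fall.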

I really don't get the problem ...