[hybi] Framing, was Re: WebSocket feedback

Dave Cridland <dave@cridland.net> Thu, 04 March 2010 14:46 UTC

References: <8B0A9FCBB9832F43971E38010638454F032E566DDF@SISPE7MB1.commscope.com> <Pine.LNX.4.64.1002150605580.29686@ps20323.dreamhostps.com>
In-Reply-To: <Pine.LNX.4.64.1002150605580.29686@ps20323.dreamhostps.com>
Message-Id: <3812.1267714007.048622@puncture>
Date: Thu, 04 Mar 2010 14:46:47 +0000
From: Dave Cridland <dave@cridland.net>
To: Ian Hickson <ian@hixie.ch>, Server-Initiated HTTP <hybi@ietf.org>

On Thu Mar  4 03:21:28 2010, Ian Hickson wrote:
> FRAMING
> 
> On Wed, 3 Feb 2010, Jamie Lokier wrote:
> > Greg Wilkins wrote:
> > >   + Why have two framing techniques when binary is sufficient to
> > >     carry everything.
> >
> > I agree.  While I acknowledge the argument that
> >
> >     print length($text), $text
> >
> > is an invitation to do the wrong thing in some languages, I think
> > that there are more ways that 0xff can lead to the wrong thing,
> > because so many languages pass around "nominal UTF-8" which is not
> > guaranteed to be UTF-8, making 0xff delimiting unreliable too.
> 
> It's true that if the server doesn't handle UTF-8 correctly, it can
> end up outputting 0xFF bytes. In practice, this would need some
> out-of-band erroneous data (e.g. an ISO-8859-1 form submission); you
> couldn't trigger this bug easily by sending data to the server and
> having it return it later, for instance. The length bug, on the other
> hand, could occur with no external data: sending non-ASCII data could
> very easily result in the server screwing things up if we use lengths
> rather than delimiters.

This is nonsensical - I don't see how the (rare) case of someone  
failing to distinguish between character counts and octet lengths is  
going to be more common than broken or invalid UTF-8.
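
A minimal sketch of the two failure modes being compared, in Python,
with an arbitrary sample string (an illustration only, not taken from
the draft). It shows a character count diverging from an octet length,
and mislabelled ISO-8859-1 data producing the 0xFF octet that sentinel
framing assumes it will never see:

```python
# Sample text (arbitrary): "naive" with a diaeresis, a space, then U+00FF.
text = "na\u00efve \u00ff"

utf8 = text.encode("utf-8")
char_count = len(text)    # number of characters
octet_count = len(utf8)   # number of octets that go on the wire

# The same text encoded as ISO-8859-1 but passed around as
# "nominal UTF-8": U+00FF becomes the single octet 0xFF.
latin1 = text.encode("iso-8859-1")
latin1_has_sentinel = b"\xff" in latin1

# Valid UTF-8 never contains the octet 0xFF.
utf8_has_sentinel = b"\xff" in utf8
```

Writing char_count where octet_count is needed is the length bug;
sending latin1 unvalidated is the delimiter bug.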

> For scalability it's probably ideal if we can use lengths, but in the
> long run for environments where that matters we'll probably just use
> compressed frames which would be binary anyway (and thus
> length-encoded), so this will probably become a non-issue in that kind
> of environment.

Except you've already thrown away binary framing.

> > >   + Who controls allocation of the frame type byte?  So far every
> > >     suggestion of usage for that (eg a bit to indicate that the
> > >     frame contains meta-data headers) has been rejected.  So are
> > >     binary users simply to pick their own bytes and hope for no
> > >     collisions?  Will IANA eventually allocate values?  Is 7 bits
> > >     enough?
> >
> > There will be no collisions for frame bytes which depend on the
> > sub-protocol name, as those frame bytes are privately agreed between
> > client and server.
> 
> If a client and a server are speaking a specific sub-protocol, they
> don't actually have to even use WebSockets -- they can just use a
> protocol that happens to look like WebSockets but defines whatever
> frame types they want.
>
> If the client is a browser, and thus they're speaking generic
> WebSockets, then the frame types would be just those supported by the
> API, and there wouldn't be any custom types. So extensions would just
> be "registered" by revving the protocol.

If we ever need to use frame types as part of the protocol (for  
chunking, to give one discussed example), then we *will* need to  
consider standardized frame types.

At this point I'd be happy with merely reserving a range for protocol
purposes, and leaving a range clear for subprotocol use.

> > > [...] users can't be trusted to always provide valid utf-8 data, so
> > > if user data is not validated then sentinel encoding allows frame
> > > injection attacks.  After all we have learnt with HTTP, it seems
> > > silly to repeat the mistake of a protocol that is exposed to such
> > > attacks
> >
> > If you have several bits of code sharing a connection - even if it's
> > just by sharing a common Javascript framework on a single page, then
> > you have security issues from this.
> 
> You certainly have security issues, but you don't have this issue
> (user data containing 0xFF), because the protocol requires the browser
> to make the data be valid UTF-8, and so you'll never be able to inject
> a 0xFF byte from the client.
>
> Of course, if the server has other sources of data, and it doesn't
> validate them, then it's possible you'll have this problem in the
> other direction.

Up above, you implied this was rare.
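
For concreteness, a hedged sketch of the injection in that "other
direction", assuming the 0x00 ... 0xFF text framing under discussion.
The function names and the naive receiver are hypothetical, written
only for this illustration:

```python
from typing import List


def frame_text(payload: bytes) -> bytes:
    """Wrap payload in a sentinel-delimited text frame: 0x00 ... 0xFF."""
    return b"\x00" + payload + b"\xff"


def frame_text_checked(payload: bytes) -> bytes:
    """As frame_text, but refuse payloads that are not valid UTF-8."""
    payload.decode("utf-8")  # raises UnicodeDecodeError if invalid
    return frame_text(payload)


def split_frames(stream: bytes) -> List[bytes]:
    """Naive receiver: a frame runs from 0x00 to the next 0xFF."""
    frames = []
    pos = 0
    while pos < len(stream):
        if stream[pos] != 0x00:
            raise ValueError("expected 0x00 frame type")
        end = stream.index(b"\xff", pos + 1)
        frames.append(stream[pos + 1:end])
        pos = end + 1
    return frames


honest = "hello".encode("utf-8")
hostile = b"hi\xff\x00injected"  # unvalidated data from some other source

frames_honest = split_frames(frame_text(honest))
frames_hostile = split_frames(frame_text(hostile))
```

The unvalidated payload arrives as two frames, "hi" and "injected",
where the sender meant one; frame_text_checked rejects it instead.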

> >      frame         = text-frame / binary-frame
> >      binary-frame  = %x80 length *%x00-FF
> >      length        = %x00 / %x01-7f / ( %x81-FF *%x80-FF %x00-7F )
> >
> > This is a canonical length encoding in that it doesn't allow for
> > leading zeroes.
> 
> Leading zeros are not disallowed. (There wouldn't be much point
> disallowing them as far as I can tell.)

Yes, because multiple representations of the same value have *never*
caused a security issue.
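
A rough sketch of what is at stake, assuming the base-128 length from
the quoted ABNF (high bit set means "more octets follow"). The
decode_length function and its strict flag are hypothetical names for
this illustration, not part of any draft:

```python
from typing import Tuple


def decode_length(data: bytes, strict: bool = True) -> Tuple[int, int]:
    """Decode a base-128 length; return (length, octets_consumed).

    Octets with the high bit set carry seven bits and continue; the
    first octet with the high bit clear ends the length.  In strict
    mode a leading 0x80 (a leading zero) is rejected, matching the
    canonical form in the ABNF.
    """
    length = 0
    for i, octet in enumerate(data):
        if strict and i == 0 and octet == 0x80:
            raise ValueError("non-canonical length: leading zero octet")
        length = (length << 7) | (octet & 0x7F)
        if octet < 0x80:
            return length, i + 1
    raise ValueError("truncated length")


canonical = decode_length(b"\x01")
padded = decode_length(b"\x80\x80\x01", strict=False)
```

Without the strict check, both inputs decode to a length of 1, so two
parties that compare or filter on the raw octets can disagree about
what was sent.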

> On Tue, 2 Feb 2010, Greg Wilkins wrote:
> > >>
> > >> The length encoding currently allows for a length of 0x80 0x80
> > >> 0x80 .... to be sent forever.  This is a nonsense length, but
> > >> could be used for DOS attacks on servers.  I think the 0x80 value
> > >> should be explicitly defined as an error if given as the first
> > >> byte of a length.
> > >
> > > How would this be different than sending 0x81 forever?
> >
> > Sending 0x81 forever should also be caught, but by the server
> > detecting an overflow of the accumulating length byte.
> 
> Ok, how is it different from just sending infinite text in a 0x00
> frame? Or, in HTTP, sending an HTTP header with an infinite value?

There is no difference, which is yet another reason why that's a  
silly framing style.

I have still seen no clear reason why a simple network-order 32-bit
integer isn't acceptable as the length, especially given the ideas on
chunking.
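
As a sketch of how little that needs: the layout below (one frame-type
octet followed by a 32-bit big-endian length) is an assumption made
for illustration, as are the function names; nothing here is taken
from a draft.

```python
import struct
from typing import Tuple

HEADER = struct.Struct("!BI")  # type octet + 32-bit network-order length


def encode_frame(frame_type: int, payload: bytes) -> bytes:
    """Prefix payload with its type and its length in octets."""
    return HEADER.pack(frame_type, len(payload)) + payload


def decode_frame(data: bytes) -> Tuple[int, bytes, bytes]:
    """Return (frame_type, payload, remaining_octets)."""
    if len(data) < HEADER.size:
        raise ValueError("incomplete header")
    frame_type, length = HEADER.unpack_from(data)
    end = HEADER.size + length
    if len(data) < end:
        raise ValueError("incomplete payload")
    return frame_type, data[HEADER.size:end], data[end:]


wire = encode_frame(0x80, b"abc")
decoded = decode_frame(wire + b"next")
```

The header is a fixed five octets, each length has exactly one
representation, and the receiver knows the payload size before
reading any of it.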

> > Again, the point is - if the algorithm style is meant to convey all
> > the error handling, then it needs to be thorough.
> 
> That's not an error, just an inefficient way of encoding the length.
> 
> The point isn't to catch every error, but that the processing be
> defined for all inputs. It doesn't matter how errors are handled, so
> long as they are handled.

That's an awesome quote.

So, your defined behaviour in the face of a DOS or overflow attack is  
to, what? Just work?
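
One possible answer, sketched under assumptions: the receiver picks a
policy limit and fails the frame as soon as either the accumulated
length or the number of length octets exceeds it. The limit value, the
exception, and the function name are all hypothetical.

```python
import itertools
from typing import Iterable

MAX_FRAME_OCTETS = 1 << 20  # hypothetical policy limit (1 MiB)


class FrameTooLarge(Exception):
    """Raised when a length field exceeds the receiver's limit."""


def decode_length_bounded(octets: Iterable[int],
                          limit: int = MAX_FRAME_OCTETS) -> int:
    """Decode a base-128 length from an octet stream, with hard bounds."""
    # Enough seven-bit groups to represent `limit`, and no more.
    max_octets = (limit.bit_length() + 6) // 7
    length = 0
    for count, octet in enumerate(octets, start=1):
        if count > max_octets:
            raise FrameTooLarge("length field too long")
        length = (length << 7) | (octet & 0x7F)
        if length > limit:
            raise FrameTooLarge("declared length exceeds limit")
        if octet < 0x80:
            return length
    raise ValueError("truncated length")


def rejects(stream: Iterable[int]) -> bool:
    """True if the bounded decoder refuses the stream."""
    try:
        decode_length_bounded(stream)
    except FrameTooLarge:
        return True
    return False


endless_81 = rejects(itertools.repeat(0x81))  # "0x81 forever"
endless_80 = rejects(itertools.repeat(0x80))  # "0x80 forever"
```

Both endless streams are refused after a handful of octets, while an
ordinary length such as 0x81 0x00 still decodes to 128.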

Dave.
-- 
Dave Cridland - mailto:dave@cridland.net - xmpp:dwd@dave.cridland.net
  - acap://acap.dave.cridland.net/byowner/user/dwd/bookmarks/
  - http://dave.cridland.net/
Infotrope Polymer - ACAP, IMAP, ESMTP, and Lemonade
