Re: [hybi] Framing take IV

Jamie Lokier <jamie@shareable.org> Fri, 06 August 2010 04:11 UTC

Greg Wilkins wrote:
> Jamie,
> sorry the text diagrams are not working... they should be in fixed width fonts.

Some trouble with Unicode, I think.  My poor terminal seems not to be
displaying non-ASCII characters usefully lately.

> Anyway - I'm trying to work out the exact nature of your proposal other than
> what I think are stylistic (eg opcode + length  vs BERlength + opcode )

Agreed about style being unimportant.

> So I think the key aspect of your  proposal I think is that the
> extended op codes are negotiated in the handshake.

Yup, that's the best part!

> You say that the *client* decides opcode values for features it
> proposes, but how does the client know what features the server
> supports?

The client doesn't know, and it doesn't need to know.

The client is defining which opcodes it will use, so that it can use
them before the server has responded in negotiation.

The server ignores frames whose opcode it doesn't recognise.

The client does not know which messages the server will act on, and
which it will ignore, until the server has responded.  That's ok, it
just means only fancy clients would use this feature; simple ones
would wait for the server.
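
A minimal sketch of the "ignore what you don't recognise" rule, in
Python.  The wire details here are assumptions made only for the
example, not anything settled in this thread: a 2-byte big-endian
length that counts everything after itself, a 1-byte opcode, and the
helper names encode_frame, iter_frames and dispatch.

```python
import struct

def encode_frame(opcode, payload):
    """Build one frame: 2-byte length, then opcode byte, then payload."""
    body = bytes([opcode]) + payload
    return struct.pack("!H", len(body)) + body

def iter_frames(buf):
    """Yield (opcode, payload) for each complete frame in buf."""
    pos = 0
    while pos + 2 <= len(buf):
        (length,) = struct.unpack_from("!H", buf, pos)
        end = pos + 2 + length
        if end > len(buf):
            break  # incomplete trailing frame
        body = buf[pos + 2:end]
        pos = end
        if body:  # a zero-length frame carries no opcode; skip it
            yield body[0], body[1:]

def dispatch(buf, handlers):
    """Run the handler for each recognised opcode; skip the rest."""
    results = []
    for opcode, payload in iter_frames(buf):
        handler = handlers.get(opcode)
        if handler is None:
            continue  # unrecognised opcode: ignore the whole frame
        results.append(handler(payload))
    return results
```

Because the length covers the opcode and everything after it, the
receiver can step over a frame without knowing what its opcode means.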

The details of handshake etc. aren't that important: This mechanism
applies whether it's embedded in the TLS phase, in the HTTP-like
handshake, after handshake prior to server response, or even later.

The point is, if it's done this way, there's a basis for avoiding
setup latency, even if actually doing so is reserved for a future
extension.

If it's the server which decides the opcodes from the start, then even
future extensions won't have any way to encode latency-avoiding
initial messages from the client.

Note that if any feature is proposed by the server and needs to wait
for the client to confirm, the server should define the opcode in that
case.  Rule: "Proposer defines opcode(s)".

> Surely it is best for the client to simply list the extensions it may
> support and then the server picks the final list and allocates opcodes
> (and/or bits)?

No, because that prevents sending any message until the server has
responded.  It adds an unnecessary round trip delay.  There is no cost
to the client choosing the values instead of the server, and it is
more versatile regarding latency avoidance in future.

This is a general principle of dynamic protocol design to avoid round
trips.  When defining something to use, the first sender/proposer
should do the defining, and just use it, not wait for agreement.
Multiplexing should do similar: Allow the first sender to create a
stream and use it immediately.  If creation fails, it will find out,
and the messages sent will just be ignored.  (The designer of the X
windowing protocol noted this as a design mistake in that one... it
makes X applications start slower)
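
A small simulation of "proposer defines opcode(s)", in Python.  The
proposal format (a dict of feature name to opcode), the feature names
and the two function names are all hypothetical; the thread does not
define a negotiation syntax.  The client sends its proposal and its
first frames in the same flight, and only afterwards learns which
frames the server acted on.

```python
def server_process(supported, proposal, frames):
    """Server side: accept the features it knows, ignore other frames.

    supported: set of feature names the server implements
    proposal:  dict of feature name -> opcode chosen by the client
    frames:    list of (opcode, data) sent before any server reply
    """
    accepted = {name: op for name, op in proposal.items() if name in supported}
    known_opcodes = set(accepted.values())
    acted_on = [(op, data) for op, data in frames if op in known_opcodes]
    return sorted(accepted), acted_on

def client_ignored(proposal, frames, accepted_names):
    """Client side: work out which early frames the server ignored."""
    accepted_opcodes = {proposal[name] for name in accepted_names}
    return [(op, data) for op, data in frames if op not in accepted_opcodes]
```

The frames returned by client_ignored are the ones a fancy client
would have to resend in a form the server does understand; a simple
client avoids the issue by waiting for the reply before sending.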

> Could you possible do some more worked examples of how you see this
> working - eg if I wanted to negotiate fragments, mime messages and
> compressed frames.

I think you've already got the basic idea.  Client sends negotiation
text which defines opcodes, and uses them after negotiation.

Note that length in all cases includes the frame's opcodes, parameters
etc. so that whole frames are skippable, as well as easily forwarded.

Negotiated fragments are simple enough: Client doesn't send fragment
opcodes until it's negotiated with the server.  It can only send whole
messages.  (Or client may send them prior to that, but on learning
that the server doesn't support them, it'll have to resend the
pre-sent content in whole messages only.  How much to pre-send is an
implementation detail, and would tend to vary over the years according
to typical server features.)  Fragments/chunks:

    length1 chunk-opcode final=0 message-opcode data1
    length2 chunk-opcode final=0 data2
    length3 chunk-opcode final=1 data3

Being equivalent to:

    (length1+length2+length3-adjustment) message-opcode data1+data2+data3

(The message-opcode is actually just part of the data, as far as the
chunks are concerned.  When concatenated together, it is interpreted
as a message).

Maybe final could always have no data; it's a stylistic detail.
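
A sketch of the chunk equivalence above, in Python.  It works on
frame bodies only (the bytes that the length field would count), and
assumes a 1-byte chunk opcode followed by a 1-byte final flag; the
value 0x10 and the function names are made up for the example.  Under
those assumptions the "adjustment" is two bytes per chunk.

```python
CHUNK_OPCODE = 0x10  # hypothetical value, would be set in negotiation

def split_message(message_body, chunk_size):
    """Split a message body (message-opcode + data) into chunk bodies."""
    pieces = [message_body[i:i + chunk_size]
              for i in range(0, len(message_body), chunk_size)] or [b""]
    last = len(pieces) - 1
    return [bytes([CHUNK_OPCODE, 1 if i == last else 0]) + piece
            for i, piece in enumerate(pieces)]

def reassemble(chunk_bodies):
    """Concatenate chunk data up to the final chunk; return the message body."""
    buf = bytearray()
    for body in chunk_bodies:
        if len(body) < 2 or body[0] != CHUNK_OPCODE:
            raise ValueError("not a chunk frame")
        buf += body[2:]
        if body[1]:  # final=1
            return bytes(buf)
    raise ValueError("final chunk missing")
```

The message-opcode is not treated specially: it is simply the first
byte of the reassembled body, which is then interpreted as a message.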

Mime messages:

    length mime-opcode mime-headers data

But I see you're going to want split MIME frames, so like this:

    length1 chunk-opcode final=0 mime-opcode mime-headers data1
    length2 chunk-opcode final=0 data2
    length3 chunk-opcode final=1 data3

A split could occur inside the mime-headers, naturally.  Or even after
the opcode; as far as chunks are concerned, the whole body is an
opaque blob to reassemble and then interpret.

Compressed frames:

    length lzo-opcode lzo(data)

If you wanted a CRC and incremental LZO compression together, where
the CRC is of the compressed data:

    length crc32-opcode lzo-opcode lzo(data) crc32(lzo(data))
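
A sketch of that nesting in Python, on the frame body only.  LZO is
not in the Python standard library, so zlib stands in for it here;
the opcode values, the 4-byte big-endian CRC trailer and the function
names are likewise assumptions for the example.

```python
import struct
import zlib

CRC32_OPCODE = 0x30  # hypothetical
LZO_OPCODE = 0x20    # hypothetical; zlib stands in for LZO below

def encode_crc_lzo(data):
    """crc32-opcode lzo-opcode lzo(data) crc32(lzo(data))"""
    compressed = zlib.compress(data)
    trailer = struct.pack("!I", zlib.crc32(compressed))
    return bytes([CRC32_OPCODE, LZO_OPCODE]) + compressed + trailer

def decode_crc_lzo(body):
    """Check the CRC of the compressed bytes, then decompress."""
    if len(body) < 6 or body[0] != CRC32_OPCODE or body[1] != LZO_OPCODE:
        raise ValueError("unexpected opcodes")
    compressed, trailer = body[2:-4], body[-4:]
    (expected,) = struct.unpack("!I", trailer)
    if zlib.crc32(compressed) != expected:
        raise ValueError("CRC mismatch")
    return zlib.decompress(compressed)
```

The CRC is checked before decompression, so a damaged frame is
rejected without feeding bad input to the decompressor.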

This raises a question: CRC and compress the message before splitting,
or each chunk?  Either order is permissible in general, but specific
opcode definitions may mandate that only one order is acceptable, and
multiplexing and chunking specifically do that.

Compressed chunks looks a bit complicated.  At risk of putting people
off, this is a whole message compressed, and then the compressed
version is split:

    length1 chunk-opcode final=0 lzo-opcode lzo(data)1
    length2 chunk-opcode final=0 lzo(data)2
    length3 chunk-opcode final=1 lzo(data)3

This is a message split first, and then each chunk compressed:

    length1 lzo-opcode chunk-opcode final=0 lzo(data1)
    length2 lzo-opcode chunk-opcode final=0 lzo(data2)
    length3 lzo-opcode chunk-opcode final=1 lzo(data3)
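
Both orderings, sketched in Python on frame bodies.  As assumptions
for the example only: zlib stands in for LZO (which the standard
library lacks), opcodes and the final flag are one byte each, and the
opcode values and function names are invented.

```python
import zlib

CHUNK_OPCODE = 0x10  # hypothetical
LZO_OPCODE = 0x20    # hypothetical; zlib stands in for LZO below

def _pieces(data, size):
    return [data[i:i + size] for i in range(0, len(data), size)] or [b""]

def compress_then_split(data, size):
    """chunk-opcode final lzo-opcode lzo(data)1 / ... lzo(data)N"""
    blob = bytes([LZO_OPCODE]) + zlib.compress(data)
    parts = _pieces(blob, size)
    last = len(parts) - 1
    return [bytes([CHUNK_OPCODE, int(i == last)]) + p for i, p in enumerate(parts)]

def split_then_compress(data, size):
    """lzo-opcode chunk-opcode final lzo(data1) / ... lzo(dataN)"""
    parts = _pieces(data, size)
    last = len(parts) - 1
    return [bytes([LZO_OPCODE, CHUNK_OPCODE, int(i == last)]) + zlib.compress(p)
            for i, p in enumerate(parts)]

def decode_compress_then_split(frames):
    blob = b"".join(f[2:] for f in frames)  # reassemble, then interpret
    if blob[0] != LZO_OPCODE:
        raise ValueError("expected compressed message")
    return zlib.decompress(blob[1:])

def decode_split_then_compress(frames):
    return b"".join(zlib.decompress(f[3:]) for f in frames)  # per chunk
```

In the first form a receiver must collect every chunk before it can
decompress; in the second each chunk can be decompressed on arrival.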

CRCs could be combined with compression, and various flavours of
compression work: incremental compression (of the whole stream),
individual message or chunk compression, and switching off
compression for specific chunks.

It should be clear that this scheme accommodates both "chunk" opcodes
and "message" opcodes without having to define the difference.

Now multiplexing:

Multiplex encoding must be fast at identifying the stream id and
length for fast forwarding.  So the definition of multiplex-opcode
will say that it can only be used as an "outer" opcode (we might
permit it inside compression and/or checksums, maybe).

Multiplexing is extremely simple:

    length multiplex-opcode stream-id ....

Where .... is any other message- or chunk-opcode and its
parameters/data.  It really can't get much simpler.  That doesn't
cover any per-stream setup negotiation or flow control.  I'm sure this works:

    length multiplex-setup-opcode new-stream-id parameters....
    length multiplex-setup-response-opcode new-stream-id response...
    length multiplex-flow-opcode stream-id amount
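
A sketch of the data-frame form only (not setup or flow control), in
Python.  The field sizes are assumptions for the example: a 2-byte
big-endian length counting everything after itself, a 1-byte
multiplex opcode and a 2-byte stream id; the opcode value and the
function names are invented.

```python
import struct

MULTIPLEX_OPCODE = 0x40  # hypothetical

def wrap(stream_id, inner_body):
    """length multiplex-opcode stream-id ...."""
    body = bytes([MULTIPLEX_OPCODE]) + struct.pack("!H", stream_id) + inner_body
    return struct.pack("!H", len(body)) + body

def demultiplex(buf):
    """Group inner frame bodies by stream id without parsing them."""
    streams = {}
    pos = 0
    while pos + 2 <= len(buf):
        (length,) = struct.unpack_from("!H", buf, pos)
        end = pos + 2 + length
        if end > len(buf):
            break  # incomplete trailing frame
        body = buf[pos + 2:end]
        pos = end
        if len(body) < 3 or body[0] != MULTIPLEX_OPCODE:
            continue  # not a multiplexed frame: skip it
        (stream_id,) = struct.unpack_from("!H", body, 1)
        streams.setdefault(stream_id, []).append(body[3:])
    return streams
```

A forwarder needs only the length and the stream id, both at fixed
offsets here, which is why the multiplex opcode is restricted to the
outer position.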

With a good extension mechanism, an interesting question is what's the
_simplest_ base protocol that can be implemented and will still be
fully compatible with the most featureful future implementations.

Probably that is just this, and nothing more, in a simplest implementation:

    length text-utf8-opcode data

Or this:

    length basic-data-opcode data

Thus fragmentation, multiplexing, compression and MIME would all be
optional extensions, and all implementations of them would be expected
to work with any implementation including the most basic, if it's
meaningfully possible.

That way, we cover a wide spectrum of implementations from "amateur
programmer", to "person who can program but right now wants to write a
short Perl script", to "lead programmer at Google Wave".  We get good
scalability, it's future-extensible, and everything across the
spectrum will interoperate beautifully.  Like HTTP but better.(*)

(*) Free software comes without warranty etc.

> Eitherway, if opcodes were able to be negotiated, then would you agree
> that this would work with the binary framing more or less how it is
> currently
> specified;
> 
>     op-code(8) + BER length + data

Yes.

> I ask this because the current framing is our starting point.  I know
> there are arguments for
>     opcode + fixed length
>     opcode + BER length
>     BER opcode + BER length
>     fix length + opcode
>     BER length + opcode
>     BER length + BER opcode
> 
> but I think they are secondary to the basic extension mechanism and
> I'd really like to separate out discussion of what is the best length
> encoding from what is a sufficient extension mechanism.

I agree, separating detailed style/range from the fields is sensible.

The order of fields just isn't very important, although I could make a
case for "fixed fields", "length", "variable fields including
opcode(s)", and Dave Cridland makes a good case for stream id being in
an easily found place, for fast forwarding.

The range arguments affect things like whether parsing split messages
needs to be mandatory in the most basic protocol (for simple implementations).

For opcodes, we could, maybe, reserve opcode bytes >= 0x80, and put
off deciding if that starts a BER or not ;-) as long as the length
comes before it.  It'll just be an ignored frame either way.

-- Jamie