Re: [hybi] Framing take IV

Roberto Peon <fenix@google.com> Wed, 04 August 2010 04:55 UTC

In-Reply-To: <AANLkTik0-gG-CE5LNt+qDN9X1QupN0dnFtKcbt2athqO@mail.gmail.com>
References: <AANLkTinyrDoG5d_Ur6HVRy=SgMPjLzJtpJ++Ye=1DQdj@mail.gmail.com> <Pine.LNX.4.64.1008040050040.5947@ps20323.dreamhostps.com> <28A6543A-5CA6-42B7-8D2E-F5511EE20008@apple.com> <AANLkTik0-gG-CE5LNt+qDN9X1QupN0dnFtKcbt2athqO@mail.gmail.com>
Date: Tue, 03 Aug 2010 21:56:15 -0700
Message-ID: <AANLkTi=2NRELYW4cK6JN5mju7nd6UOYjhAErHK4saDDU@mail.gmail.com>
From: Roberto Peon <fenix@google.com>
To: ifette@google.com
Cc: hybi@ietf.org
Subject: Re: [hybi] Framing take IV

On Tue, Aug 3, 2010 at 6:33 PM, Ian Fette (イアンフェッティ) <ifette@google.com> wrote:

> On Tue, Aug 3, 2010 at 6:18 PM, Maciej Stachowiak <mjs@apple.com> wrote:
>
>>
>> On Aug 3, 2010, at 5:53 PM, Ian Hickson wrote:
>>
>> > On Wed, 4 Aug 2010, Greg Wilkins wrote:
>> >>
>> >> I think that we have reasonable consensus on something like:
>> >>
>> >>  +--------------------------------------------------+
>> >>  | frag(1) | unused(3) | opcode(4) | Length(16)     |
>> >>  +--------------------------------------------------+
>> >>  |                      Data                        |
>> >>  +--------------------------------------------------+
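
For concreteness, here is a minimal sketch of decoding the 3-byte header Greg
drew above (a frag bit, three unused bits, a 4-bit opcode, and a 16-bit
network-order length). The field names and the meaning of the frag bit are my
own reading of the diagram, not anything normative.

  // Decode the proposed header: frag(1) | unused(3) | opcode(4) | length(16).
  // Network byte order for the length is my assumption, not spec text.
  interface FrameHeader {
    frag: boolean;   // continuation/fragment flag, per the diagram
    opcode: number;  // 0..15
    length: number;  // payload length in bytes, 0..65535
  }

  function decodeHeader(buf: Uint8Array): FrameHeader {
    if (buf.length < 3) throw new Error("need at least 3 header bytes");
    const b0 = buf[0];
    return {
      frag: (b0 & 0x80) !== 0,        // top bit
      opcode: b0 & 0x0f,              // low nibble; bits 4..6 are unused(3)
      length: (buf[1] << 8) | buf[2], // 16-bit big-endian length
    };
  }
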
>> >
>> > Why would we have a fixed length field with fragmentation rather than a
>> > variable length field?
>> >
>> > If we can have a variable width length field, do we need to support
>> > fragmentation in the first version? I could see an argument for
>> supporting
>> > fragmentation in the case of multiplexing, but without that it doesn't
>> > seem to actually gain us anything.
>>
>> I agree. I can't see any benefit to fragmentation over a variable-size
>> length field for an initial version without multiplexing. If the
>> variable-size length field is well-designed, then in the common case where
>> the message size is small it will only cost one extra branch to read the
>> length. In the rare case where the message size is large, a variable-size
>> length is easier to deal with than reassembling fragments, and easier on the
>> sending side too.
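
To make the "one extra branch" point concrete, here is one possible shape for
a variable-size length field. The specific encoding (one byte for lengths up
to 254, with 0xFF as an escape to a 32-bit length) is invented purely for
illustration; it is not a proposal from this thread.

  // Hypothetical variable-size length field: 0..254 fits in one byte; the
  // escape value 0xFF means a 32-bit big-endian length follows.  The common
  // (small-message) case costs a single comparison.
  function readLength(buf: Uint8Array, offset: number): { length: number; bytesUsed: number } {
    const first = buf[offset];
    if (first < 0xff) {
      return { length: first, bytesUsed: 1 };            // common case
    }
    const view = new DataView(buf.buffer, buf.byteOffset + offset + 1, 4);
    return { length: view.getUint32(0), bytesUsed: 5 };  // rare, large message
  }

  function writeLength(length: number): Uint8Array {
    if (length < 0xff) return Uint8Array.of(length);
    const out = new Uint8Array(5);
    out[0] = 0xff;
    new DataView(out.buffer).setUint32(1, length);       // big-endian by default
    return out;
  }

With something like this, the sender never has to fragment and the receiver
never has to reassemble; the only cost is the branch above.
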
>>
>> I'm not really persuaded we need multiplexing at all until I see data
>> proving the benefits. Useful data would be:
>>
>> - Average degree of multiplexing we could likely get in real Web
>> applications, based on measured user behavior. I'd like to see this for at
>> least two different Web apps from two different vendors to make sure we are
>> not overtuning to a single vendor's setup.
>>
>>
> I don't think we would be able to get any data from a Google deployment in
> a reasonable time for this discussion, sadly.
>

We do have data on at least the average number of TCP connections that HTTP
creates per page. I think we've mentioned it before, but it was a long time
ago!
I believe we measured it by taking the Alexa top 1000 and counting the
concurrent connections for each page.
It is my fervent hope that WebSocket wouldn't be that bad, but without
multiplexing it seems to be at best an HTTP analogue in terms of connection
usage (with the caveat that it aims for long-lived connections, which
possibly makes it far worse!).
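
For anyone who wants to reproduce that kind of number: given per-connection
open/close timestamps for a page load, the peak concurrency is a simple sweep.
This is a rough sketch of only the counting step (the Conn shape and function
are mine for illustration), not of whatever instrumentation was actually used.

  // Peak concurrent TCP connections for one page load, given (openMs, closeMs)
  // per connection.  How the timestamps are captured is out of scope here.
  interface Conn { openMs: number; closeMs: number; }

  function peakConcurrentConnections(conns: Conn[]): number {
    const events = conns.flatMap(c => [
      { t: c.openMs, delta: +1 },
      { t: c.closeMs, delta: -1 },
    ]);
    events.sort((a, b) => a.t - b.t || a.delta - b.delta); // close before open on ties
    let current = 0, peak = 0;
    for (const e of events) {
      current += e.delta;
      peak = Math.max(peak, current);
    }
    return peak;
  }
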


>
>
>> - Concrete performance benefits of multiplexing over a single TCP
>> connection vs. using multiple connections. I have yet to hear an answer that
>> involves numbers and doesn't sound handwavey.
>>
>
> For us it's the number of open sockets on the frontend, plain and simple.
> Each open socket == more kernel memory for socket buffers. Whereas, if we
> have multiplexing, I can take in a frame, see what stream it is, send it off
> to the appropriate backend and be done. Holding open multiple connections
> for a single user kills us.
>
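
To put that frontend picture in code-shaped terms: if each frame carried a
stream (channel) id, the terminating frontend's per-frame work is a lookup and
a forward, with no per-stream socket or buffer. The 16-bit channel id and the
Backend interface below are assumptions for illustration only; no such field
exists in the current draft.

  // Sketch of a frontend demultiplexing loop over a single client connection.
  interface Backend { forward(payload: Uint8Array): void; }

  class FrontendDemux {
    constructor(private backends: Map<number, Backend>) {}

    // Called once per complete frame read off the client connection.
    onFrame(frame: Uint8Array): void {
      const channelId = (frame[0] << 8) | frame[1];  // assumed 16-bit channel id
      const payload = frame.subarray(2);
      const backend = this.backends.get(channelId);
      if (!backend) throw new Error(`unknown channel ${channelId}`);
      backend.forward(payload);                       // no per-stream socket state
    }
  }
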

Mike Belshe or I can quote numbers from experiments with SPDY, which is
attempting to solve a different problem but does show significant benefits
from multiplexing.
We're deploying SPDY to clients and servers now, and we're still in the
process of gathering real-world data.


>
>
>>
>> - Performance cost of implementing multiplexing at the application level
>> compared to doing it at the protocol level (either in terms of more
>> connections resulting and the actual cost of that, or in terms of additional
>> overhead on the client side from using an iframe or shared worker or
>> whatever to share the connection.)
>>
>>
> At the application level we would need an <iframe> to google.com to get a
> shared worker, and then use postMessage. The iframe itself would be enough
> of a latency hit that we would consider it problematic for e.g. deployment
> on the homepage. Plus, if we wanted to multiplex sending large and small
> data, we would probably have to break up the large data at
> an application level and re-assemble it at an application level on the
> server side, and I cannot imagine that being performant. Again, I'm much
> more interested in squeezing out every last drop of performance. That's the
> whole reason for WS in my mind as opposed to existing JS APIs. If
> multiplexing isn't part of the base API, at the very least it needs to be
> possible to do it efficiently via a protocol-level extension, and we need to
> define the mechanisms for such extensions now so that we can start doing
> experiments and get deployments and the data you request.
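
For comparison, the application-level workaround Ian describes looks roughly
like this on the client side: a SharedWorker (reached via the iframe) owns the
single WebSocket, and every page talks to it over postMessage, with channel
ids and chunking handled by hand. The endpoint URL, message shape, and chunk
size below are all invented for illustration.

  // shared-worker.ts -- rough sketch of doing the multiplexing in application code.
  const ws = new WebSocket("wss://example.com/shared");   // hypothetical endpoint
  const ports: MessagePort[] = [];

  (self as any).onconnect = (e: MessageEvent) => {
    const port = e.ports[0];
    ports.push(port);
    port.onmessage = (msg: MessageEvent) => {
      // Application-level framing: chunk large payloads ourselves and let the
      // server reassemble by (channel, seq), since the protocol gives no help.
      const { channel, data } = msg.data as { channel: number; data: string };
      const CHUNK = 16 * 1024;
      for (let i = 0; i < data.length; i += CHUNK) {
        ws.send(JSON.stringify({
          channel,
          seq: i / CHUNK,
          last: i + CHUNK >= data.length,
          data: data.slice(i, i + CHUNK),
        }));
      }
    };
  };

  // Fan incoming traffic back out; a real version would route by channel.
  ws.onmessage = (ev) => ports.forEach(p => p.postMessage(ev.data));

The extra hop through postMessage, plus the iframe load itself, is the latency
cost being described above.
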
>
>
>> I'm open to trading off simplicity for performance, but only if I see data
>> backing up the performance claims. Every performance project should start
>> with measurement.
>>
>
I completely understand your concern; I have the mirror-image concern w.r.t.
scalability. Thankfully, at least with SPDY, we have data that we can speak
about publicly.

I'm happy to try (try being the operative word) to find people here to run
experiments and provide data. The trick, of course, is that we don't seem
willing to wait for the time it will take to get the experiments up and
analyze the data before coming to a conclusion on the spec :(

-=R


>> Regards,
>> Maciej
>>