Re: [arch-d] HTTP 2.0 + Performance improvement Idea

Rakshith Venkatesh <vrock28@gmail.com> Thu, 27 March 2014 04:06 UTC

From: Rakshith Venkatesh <vrock28@gmail.com>
Date: Thu, 27 Mar 2014 09:36:20 +0530
To: Joe Touch <touch@isi.edu>
Cc: architecture-discuss@ietf.org
Subject: Re: [arch-d] HTTP 2.0 + Performance improvement Idea
Archived-At: http://mailarchive.ietf.org/arch/msg/architecture-discuss/7wibQyz_RkGMWolE0cvvRLtBiz0

> the way you initiated your system may be causing the out-of-order responses.

The main intention was for the reader to hold a lock on the complete file and
let these X workers do the job without having to worry about the order in
which the workers return.
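
For concreteness, here is a minimal sketch (in Python) of the zone-reading
scheme described above. It is an illustration only: the zone size matches the
experiment, but the worker count, file handling, and pool-based scheduling
are my assumptions, not the actual implementation.

    # Sketch only: X workers each pull the next unread 64 KB zone as soon
    # as they finish their current one, so results complete out of order.
    import os
    from concurrent.futures import ThreadPoolExecutor, as_completed

    ZONE = 64 * 1024               # 64 KB zones, as in the experiment
    WORKERS = 8                    # "X" workers; this value is made up

    def read_zone(path, offset):
        with open(path, 'rb') as f:          # each worker opens its own fd
            f.seek(offset)
            return offset, f.read(ZONE)

    def parallel_read(path):
        size = os.path.getsize(path)
        with ThreadPoolExecutor(max_workers=WORKERS) as pool:
            futures = [pool.submit(read_zone, path, off)
                       for off in range(0, size, ZONE)]
            # An idle worker picks up the next queued zone, which gives
            # the "whoever came back first reads zone X+1" behaviour.
            for fut in as_completed(futures):
                yield fut.result()   # (offset, data) in completion order

(Submitting every zone up front is fine for a sketch; for a 512 GB file one
would of course queue zones incrementally.)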

> So let's assume you could transmit these blocks in the order they return from reads. Have you considered the impact on the receive side?

Yes, I agree with your point that the client side must have enough resources
to hold the out-of-order buffers without affecting other applications.
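
To make that receive-side cost concrete, a minimal reassembly sketch might
look like the following (hypothetical, not something we have built); the
memory held grows with how far ahead of the next in-order byte the fastest
worker runs:

    # Sketch only: buffer chunks that arrive ahead of the next in-order
    # byte, and release a contiguous run whenever the gap is filled.
    class Reassembler:
        def __init__(self):
            self.pending = {}        # offset -> bytes, held out of order
            self.next_offset = 0     # next byte the application expects

        def receive(self, offset, data):
            """Accept one (offset, data) chunk; return any bytes that
            are now deliverable in order (possibly empty)."""
            self.pending[offset] = data
            out = []
            while self.next_offset in self.pending:
                chunk = self.pending.pop(self.next_offset)
                out.append(chunk)
                self.next_offset += len(chunk)
            return b"".join(out)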

So I wanted to know whether it is a good trade-off to use more resources on
the client/host side than on the server side while trying to achieve
parallelism. What is your take on this, and how can I take this forward from
here?
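
As one possible shape for the on-the-wire tag, your earlier suggestion of an
HTTP/1.1 chunk extension carrying the offset (quoted below) might look like
this. The "offset" extension name is hypothetical, not a registered
extension, and HTTP/2 framing would need its own field:

    # Sketch only: one HTTP/1.1 chunk whose chunk-extension carries the
    # file offset of the data it holds; "offset" is a made-up name.
    def make_chunk(offset, data):
        # e.g. 64 KB at offset 128 KB -> b'10000;offset=20000\r\n...\r\n'
        return b"%x;offset=%x\r\n" % (len(data), offset) + data + b"\r\n"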

Thanks,
Rakshith



On Wed, Mar 26, 2014 at 6:51 PM, Joe Touch <touch@isi.edu> wrote:

>
>
> On 3/25/2014 11:20 PM, Rakshith Venkatesh wrote:
>
>> I think I should explain what was done and what we saw as
>> improvements. What I meant was: let's say the block size of a file
>> system is 64 KB. Now, if a file is, say, 512 GB, in the normal world
>> I would fetch 64 KB of data sequentially until I had read all of it.
>> Every 64 KB of data sent across the wire could get segmented by TCP,
>> and the segments would get re-ordered at the client side to present
>> the 64 KB of data to the application intact. Please note that I am
>> not referring to this re-ordering as part of my idea.
>>
>> Further, what was done was to divide the whole file into 64 KB zones
>> and have X workers waiting to fetch the zones. So workers 1 to X
>> just went ahead and read the first X zones of data. Whichever of
>> these X workers came back first was assigned the task of reading
>> zone X+1, and so on.
>>
>
> That sounds like you tried to start all X workers at the same time, but
> such events are rarely truly simultaneous.
>
> I.e., the way you initiated your system may be causing the out-of-order
> responses. Or there could be a resource interaction in which the order you
> see is a side-effect of different processes or threads interacting with
> shared locks.
>
>
>> This experiment saw a performance improvement of at least 20-30% in
>> the time taken to read the file. The only thing that stopped us from
>> sending the reads across the wire was that the client expected the
>> data bytes to be in order; i.e., if any worker came back before the
>> first worker, I could not send the data that worker had fetched,
>> because of the constraint I mentioned previously.
>>
>
> So let's assume you could transmit these blocks in the order they return
> from reads. Have you considered the impact on the receive side? I.e., you'd
> need to do the same study on the impact of out-of-order writes on
> performance.
>
>
>> This also led to a situation where I had to hold onto the buffer
>> resources even after I had finished reading sub-portions of the
>> file, because fetching the whole file was not yet finished.
>>
>
> Sure, but unless your send and receive side are exactly matched in terms
> of the order impact on performance, one side will end up holding buffers
> too.
>
> I.e., it sounds like you have some OS research to complete before you know
> you'll be able to take advantage of block-oriented framing in the transport
> protocol.
>
> Joe
>
>
>>
>> On Wed, Mar 26, 2014 at 10:46 AM, Joe Touch <touch@isi.edu> wrote:
>>
>>
>>     On Mar 25, 2014, at 6:01 PM, Rakshith Venkatesh <vrock28@gmail.com> wrote:
>>
>>      > I am not sure if an NFS client can accept blocks out of order
>>      > from a server. I need to check on that.
>>
>>     NFS should be matching read responses to requests; order should not
>>     matter, esp. because when using UDP the requests and responses can
>>     be reordered by the network.
>>
>>      > If it does, then NFS over HTTP sounds good. The way I see it,
>>      > any file-service protocol such as NFS, SMB (large MTUs), HTTP,
>>      > or FTP should have a way to help the server achieve true
>>      > parallelism in reading a file without worrying about sending
>>      > data/blocks in order.
>>
>>     FTP has block mode that should support any order of response, but if
>>     they're all going over the same TCP connection it can help only with
>>     the source access order rather than network reordering.
>>
>>     HTTP has chunking that should help with this (e.g., using a chunk
>>     extension field that indicates chunk offset).
>>
>>     However, most file systems are engineered to manage files via
>>     in-order access, even when parallelized, so I'm not sure how much
>>     this will all help. I.e., I don't know if your assumption is valid,
>>     that overriding the serial access will be faster.
>>
>>     It might be useful to first show that you can get to the components
>>     of a file faster, as you assume.
>>
>>     Joe
>>
>>
>>      >
>>      > Rakshith
>>      >
>>      >
>>      > On Wed, Mar 26, 2014 at 2:47 AM, Joe Touch <touch@isi.edu> wrote:
>>      > You sound like you're looking for NFS; maybe you should propose
>>      > an "NFS over HTTP" variant where responses can be issued out of
>>      > order?
>>      >
>>      > Joe
>>      >
>>      >
>>      > On 3/25/2014 4:56 AM, Rakshith Venkatesh wrote:
>>      > Hi,
>>      >
>>      > I was going through the SPDY draft / the initial HTTP 2.0 draft.
>>      > I had an idea which I think would require a change in its
>>      > architecture, so I am sending this mail. Here is the idea:
>>      >
>>      > An HTTP client expects the server to send data in order. (NOTE:
>>      > when I say a server, I am referring to an appliance to which
>>      > disks are attached, with the file residing on a disk.) If there
>>      > is a file of, let's say, 10 GB and the client asks the server
>>      > for this file, the data has to be given in order from the first
>>      > byte to the last byte. Now let's say I implement an engine on
>>      > the server side to do parallel reads on this huge file, fetching
>>      > data at various offsets within the file. I will be able to fetch
>>      > the data faster for sure, but I will not be able to send the
>>      > data across immediately as I fetch it. I am expected to finish
>>      > all the parallel reads on the file until I have read all the
>>      > bytes, and only then send them across the wire to the client.
>>      >
>>      > Now, if some tag or header could be introduced as part of HTTP
>>      > 2.0 that can help re-order the byte stream at the session layer,
>>      > or at a layer between the application and session layers, we
>>      > could potentially improve the performance of file reads over
>>      > HTTP: the new module would look at this new tag, rearrange the
>>      > data based on it, and eventually present the data to HTTP so
>>      > that it all looks seamless.
>>      >
>>      > So the server can just do parallel reads on the same file at
>>      > various offsets, without worrying about ordering, and send the
>>      > read chunks across; this new module sitting at the client side
>>      > can intervene, look at some form of tag/header, swap the data
>>      > into place accordingly, wait until all the data has been
>>      > received, and then present it to the application protocol.
>>      >
>>      > NOTE: I am not referring to packet reordering at TCP.
>>      >
>>      > By having this, servers can attain true parallelism in reading
>>      > the file and can effectively improve file-transfer rates.
>>      >
>>      >
>>      > Thanks,
>>      >
>>      > Rakshith
>>      >
>>      >
>>      >
>>      > _______________________________________________
>>      > Architecture-discuss mailing list
>>      > Architecture-discuss@ietf.org
>>      > https://www.ietf.org/mailman/listinfo/architecture-discuss
>>      >
>>      >
>>
>>
>>