Re: [arch-d] HTTP 2.0 + Performance improvement Idea
Rakshith Venkatesh <vrock28@gmail.com> Thu, 27 March 2014 04:06 UTC
Return-Path: <vrock28@gmail.com>
X-Original-To: architecture-discuss@ietfa.amsl.com
Delivered-To: architecture-discuss@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 7681B1A0293 for <architecture-discuss@ietfa.amsl.com>; Wed, 26 Mar 2014 21:06:45 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.749
X-Spam-Level:
X-Spam-Status: No, score=-1.749 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, FREEMAIL_ENVFROM_END_DIGIT=0.25, FREEMAIL_FROM=0.001, HTML_MESSAGE=0.001, SPF_PASS=-0.001] autolearn=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id VHtZv32JYQOC for <architecture-discuss@ietfa.amsl.com>; Wed, 26 Mar 2014 21:06:42 -0700 (PDT)
Received: from mail-vc0-x232.google.com (mail-vc0-x232.google.com [IPv6:2607:f8b0:400c:c03::232]) by ietfa.amsl.com (Postfix) with ESMTP id 3DFD31A0438 for <architecture-discuss@ietf.org>; Wed, 26 Mar 2014 21:06:42 -0700 (PDT)
Received: by mail-vc0-f178.google.com with SMTP id im17so3564110vcb.37 for <architecture-discuss@ietf.org>; Wed, 26 Mar 2014 21:06:40 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc:content-type; bh=1AWP9jkSmlC+o+96qsRPnTBB5+p6w1UOHrzzlRSrWC0=; b=Hk6JB4aAlMVtfKXmSqsjBVR6IwVxMqnfZ8wO0HEC4vneDCmpzvq+wV2L100UIkHCFb g87B38l3F6YbveyGSCrlLMy4l+Bf7rKbFEwQpaFpHyJeHZU8YSOrVsFD7dlP/ocLBYrr 9r0VXi3/6tkPkD3vc69EkA2baLR1KXTMsCA27YGJn8GjFpeYM+m3rfjLVOk2DryHvlgH G938aKaxyPdXfUlTb5Qxc1hW0IJNw01ik4mJ91PouVBsh8TtG5srV+qvKZwBeCzTJllF UOwPQBqX16K2GUAA9/da1oNPJeKlqt1a9UVcUZB1bdJi4DWoni7dQUWkcAQ5chRv9Lac XDVQ==
X-Received: by 10.221.27.8 with SMTP id ro8mr1851vcb.30.1395893200440; Wed, 26 Mar 2014 21:06:40 -0700 (PDT)
MIME-Version: 1.0
Received: by 10.58.92.233 with HTTP; Wed, 26 Mar 2014 21:06:20 -0700 (PDT)
In-Reply-To: <5332D44B.1010003@isi.edu>
References: <CANw0z+Wy09iGvwL2DgzkMLdNxcwxOHmd38yxGz0H6v=FGpzEJw@mail.gmail.com> <5331F25C.20803@isi.edu> <CANw0z+Vy1imt-HmZdqVpNzq4-Gd7cKC5B=PmAe7bBYrcHCD2mQ@mail.gmail.com> <5EE5585A-19E4-4940-B21A-4BA208F08B78@isi.edu> <CANw0z+U2tYkBmU4zP_HLxN1xR0p0ko08ZAsUhzeZfqMXk6c4pw@mail.gmail.com> <5332D44B.1010003@isi.edu>
From: Rakshith Venkatesh <vrock28@gmail.com>
Date: Thu, 27 Mar 2014 09:36:20 +0530
Message-ID: <CANw0z+VpUPdiJ299XNFiLkuYspTwcB+OK94Jzg8CaBpQsYaiTg@mail.gmail.com>
To: Joe Touch <touch@isi.edu>
Content-Type: multipart/alternative; boundary="001a11336baa5b95c404f58eb69e"
Archived-At: http://mailarchive.ietf.org/arch/msg/architecture-discuss/7wibQyz_RkGMWolE0cvvRLtBiz0
Cc: architecture-discuss@ietf.org
Subject: Re: [arch-d] HTTP 2.0 + Performance improvement Idea
X-BeenThere: architecture-discuss@ietf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: open discussion forum for long/wide-range architectural issues <architecture-discuss.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/architecture-discuss>, <mailto:architecture-discuss-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/architecture-discuss/>
List-Post: <mailto:architecture-discuss@ietf.org>
List-Help: <mailto:architecture-discuss-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/architecture-discuss>, <mailto:architecture-discuss-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 27 Mar 2014 04:06:45 -0000
> The way you initiated your system may be causing the out-of-order
> responses.

The main intention was to have a complete file lock for the reader and
to have these X workers do the job without having to worry about the
order in which the workers return.

> So let's assume you could transmit these blocks in the order they
> return from reads. Have you considered the impact on the receive side?

Yes. I agree with your point that the client side must have enough
resources to hold the out-of-order buffers without affecting other
applications. So I wanted to know whether it is a good trade-off to use
more resources on the client/host side than on the server side while
trying to achieve parallelism. What is your take on this, and how can I
go ahead with this from here?

Thanks,
Rakshith

On Wed, Mar 26, 2014 at 6:51 PM, Joe Touch <touch@isi.edu> wrote:
>
> On 3/25/2014 11:20 PM, Rakshith Venkatesh wrote:
>
>> I think I should explain what was done and what we saw as
>> improvements. What I meant was: let's say the block size of a file
>> system is 64 KB. Now, if a file is, say, 512 GB, in the normal world
>> I would fetch 64 KB of data sequentially until I had read all of it.
>> Every 64 KB of data sent across the wire could get chunked by TCP,
>> and the chunks would get re-ordered at the client side to present the
>> 64 KB of data to the application intact. Please note that I am not
>> referring to this re-ordering as part of my idea.
>>
>> Further, what was done was to divide the whole file into 64 KB zones
>> and have X workers waiting to fetch each zone. So workers 1 to X just
>> went ahead reading the first X zones of data. Whichever of these X
>> workers came back first was assigned the task of reading zone X+1,
>> and so on and so forth.
>
> That sounds like you tried to start all X workers at the same time, but
> such events are rarely truly simultaneous.
>
> I.e., the way you initiated your system may be causing the out-of-order
> responses.
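For concreteness, the zone/worker scheme described above can be sketched roughly as follows. This is a minimal illustration, not the actual implementation from the experiment; the 64 KB zone size is taken from the thread, while the function names, thread-pool choice, and worker count are assumptions:

```python
import os
from concurrent.futures import ThreadPoolExecutor, as_completed

ZONE_SIZE = 64 * 1024  # 64 KB zones, as in the experiment described above

def read_zone(path, zone_index):
    # Each worker opens its own handle, seeks to its zone, and reads it.
    with open(path, "rb") as f:
        f.seek(zone_index * ZONE_SIZE)
        return zone_index, f.read(ZONE_SIZE)

def parallel_read(path, workers=8):
    # Submit one task per zone; the pool keeps `workers` tasks in flight,
    # so whichever worker finishes first picks up the next pending zone --
    # the same first-come-first-served assignment described in the thread.
    n_zones = (os.path.getsize(path) + ZONE_SIZE - 1) // ZONE_SIZE
    results = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(read_zone, path, i) for i in range(n_zones)]
        for fut in as_completed(futures):  # completion order, not zone order
            i, data = fut.result()
            results[i] = data
    return results  # zone_index -> bytes, gathered out of order

```

Note that `results` fills in completion order, which is exactly the out-of-order behavior under discussion: the reads may finish in any order, and something downstream has to restore the original byte order.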
> Or there could be a resource interaction in which the order you see is
> a side-effect of different processes or threads interacting with
> shared locks.
>
>> This experiment saw a minimum performance improvement of 20-30% in
>> terms of time taken to read the file. The only thing that stopped us
>> from sending the reads across the wire was that the client expected
>> the data bytes to be in order; i.e., if any worker came back before
>> the 1st worker, I could not send the data fetched by that worker,
>> because of the constraint I mentioned previously.
>
> So let's assume you could transmit these blocks in the order they
> return from reads. Have you considered the impact on the receive side?
> I.e., you'd need to do the same study on the impact of out-of-order
> writes on performance.
>
>> This also led to a situation where I had to hold onto the buffer
>> resources even though I had finished reading sub-portions of the
>> file, because fetching the whole file was yet to be completed.
>
> Sure, but unless your send and receive sides are exactly matched in
> terms of the order impact on performance, one side will end up holding
> buffers too.
>
> I.e., it sounds like you have some OS research to complete before you
> know you'll be able to take advantage of block-oriented framing in the
> transport protocol.
>
> Joe
>
>> On Wed, Mar 26, 2014 at 10:46 AM, Joe Touch <touch@isi.edu
>> <mailto:touch@isi.edu>> wrote:
>>
>> On Mar 25, 2014, at 6:01 PM, Rakshith Venkatesh <vrock28@gmail.com
>> <mailto:vrock28@gmail.com>> wrote:
>>
>> > I am not sure if an NFS client can accept blocks out of order
>> > from a server. I need to check on that.
>>
>> NFS should be matching read responses to requests; order should not
>> matter, esp. because when using UDP the requests and responses can
>> be reordered by the network.
>>
>> > If it does then NFS over HTTP sounds good.
>> > The way I see it is that any file-service protocol such as NFS,
>> > SMB (large MTUs), HTTP, or FTP should have a way to help the
>> > server achieve true parallelism in reading a file without worrying
>> > about sending data/blocks in order.
>>
>> FTP has a block mode that should support any order of response, but
>> if the blocks are all going over the same TCP connection it can help
>> only with the source access order rather than network reordering.
>>
>> HTTP has chunking that should help with this (e.g., using a chunk
>> extension field that indicates chunk offset).
>>
>> However, most file systems are engineered to manage files via
>> in-order access, even when parallelized, so I'm not sure how much
>> this will all help. I.e., I don't know if your assumption is valid,
>> that overriding the serial access will be faster.
>>
>> It might first be useful to show that you can get to the components
>> of a file faster, as you assume.
>>
>> Joe
>>
>> >
>> > Rakshith
>> >
>> > On Wed, Mar 26, 2014 at 2:47 AM, Joe Touch <touch@isi.edu
>> > <mailto:touch@isi.edu>> wrote:
>> > You sound like you're looking for NFS; maybe you should propose
>> > an "NFS over HTTP" variant where responses can be issued out of
>> > order?
>> >
>> > Joe
>> >
>> > On 3/25/2014 4:56 AM, Rakshith Venkatesh wrote:
>> > Hi,
>> >
>> > I was going through the SPDY draft, i.e., the initial HTTP 2.0
>> > draft. I had an idea which I think would require a change in its
>> > architecture, so I am sending this mail. Here is the idea:
>> >
>> > An HTTP client expects the server to send data in order. (NOTE:
>> > when I say a server, I am referring to an appliance to which disks
>> > are attached and on whose disk the file resides.) If there is a
>> > file of, let's say, 10 GB and the client asks the server for this
>> > file, the data has to be delivered in order from the 1st byte to
>> > the last byte.
>> > Now let's say I implement an engine at the server side to do a
>> > parallel read on this huge file, fetching data at various offsets
>> > within the file. I will be able to fetch the data faster for sure,
>> > but I will not be able to send the data across immediately as I
>> > fetch it. I am expected to finish all the parallel reads on the
>> > file, until I have read all the bytes, and only then send it
>> > across the wire to the client.
>> >
>> > Now, if we can have some tag or header introduced as part of
>> > HTTP 2.0 that can help re-order the byte stream at the session
>> > layer, or at a layer between the application and session layers,
>> > we could potentially improve the performance of file reads over
>> > HTTP: the new module would look at this new tag, rearrange the
>> > data based on it, and eventually present the data to HTTP so that
>> > it all looks seamless.
>> >
>> > So the server can just do parallel reads on the same file at
>> > various offsets, without worrying about ordering, and send the
>> > read chunks across; this new module sitting at the client side can
>> > intervene, look at some form of tag/header to put the data back in
>> > order, wait until all data is received, and then present it to the
>> > application protocol.
>> >
>> > NOTE: I am not referring to packet reordering at TCP.
>> >
>> > By having this, servers can attain true parallelism in reading the
>> > file and can effectively improve file transfer rates.
>> >
>> >
>> > Thanks,
>> >
>> > Rakshith
>> >
>> >
>> > _______________________________________________
>> > Architecture-discuss mailing list
>> > Architecture-discuss@ietf.org <mailto:Architecture-discuss@ietf.org>
>> > https://www.ietf.org/mailman/listinfo/architecture-discuss
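For illustration only, Joe's chunk-extension suggestion quoted above might look like the following on the wire in HTTP/1.1 chunked framing (the `offset` extension name is hypothetical, not a registered extension; chunk sizes are in hex, so `10000` is 64 KB):

```
HTTP/1.1 200 OK
Transfer-Encoding: chunked

10000;offset=10000
<the 64 KB block at byte offset 0x10000, sent first because its read finished first>
10000;offset=0
<the 64 KB block at byte offset 0x0>
0
```

The server emits each block as soon as its read completes, and the offset extension carries the ordering information the client needs.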
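If blocks are transmitted as soon as their reads complete, each tagged with its byte offset (the tag/header proposed above), the client-side reordering module might look like this minimal sketch. The class name and API are hypothetical, assuming explicit (offset, data) tags on each received chunk:

```python
class ReorderBuffer:
    """Accepts (offset, data) chunks in any order and releases bytes to
    the application strictly in order. Out-of-order chunks are held in
    memory until the gap before them is filled -- exactly the
    receive-side buffering cost discussed in this thread."""

    def __init__(self):
        self.next_offset = 0   # next byte the application is owed
        self.pending = {}      # offset -> data, held until contiguous

    def push(self, offset, data):
        # Store the chunk, then release everything that is now contiguous.
        self.pending[offset] = data
        out = []
        while self.next_offset in self.pending:
            chunk = self.pending.pop(self.next_offset)
            out.append(chunk)
            self.next_offset += len(chunk)
        return b"".join(out)  # bytes deliverable in order (may be empty)

    def held_bytes(self):
        # Bytes stuck behind a gap: the buffering cost on the client.
        return sum(len(d) for d in self.pending.values())

```

`held_bytes` makes Joe's receive-side concern concrete: everything that arrives behind a gap must be buffered on the client until the gap fills, so the buffer-holding problem moves from the server to the client rather than disappearing.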