Re: [nfsv4] Write-behind caching
Benny Halevy <bhalevy@panasas.com> Wed, 27 October 2010 16:25 UTC
Message-ID: <4CC852FA.4050903@panasas.com>
Date: Wed, 27 Oct 2010 18:27:38 +0200
From: Benny Halevy <bhalevy@panasas.com>
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.9) Gecko/20100921 Fedora/3.1.4-1.fc13 Thunderbird/3.1.4
MIME-Version: 1.0
To: david.noveck@emc.com
References: <BF3BB6D12298F54B89C8DCC1E4073D80028C76DB@CORPUSMX50A.corp.emc.com> <E043D9D8EE3B5743B8B174A814FD584F0D498D54@TK5EX14MBXC126.redmond.corp.microsoft.com> <BF3BB6D12298F54B89C8DCC1E4073D80028C76E0@CORPUSMX50A.corp.emc.com>
In-Reply-To: <BF3BB6D12298F54B89C8DCC1E4073D80028C76E0@CORPUSMX50A.corp.emc.com>
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 7bit
X-OriginalArrivalTime: 27 Oct 2010 16:27:40.0237 (UTC) FILETIME=[DFCBCFD0:01CB75F3]
Cc: nfsv4@ietf.org
Subject: Re: [nfsv4] Write-behind caching
On 2010-10-26 05:24, david.noveck@emc.com wrote:
> I agree that the intent was to cover a variety of layout types.
>
> I think what you are saying about the issue of different throughputs for
> having and not having layouts also makes sense.  It may in some way have
> led to the statement in RFC5661 but those statements are by no means the
> same.  They have different consequences.  I take it that you are saying
> (correctly) something like:
>
>     However, write-behind implementations will generally need to bound
>     the amount of unwritten data so that given the bandwidth of the
>     output path, the data can be written in a reasonable time.  Clients
>     which have layouts should avoid keeping larger amounts to reflect a
>     situation in which a layout provides a write path of higher
>     bandwidth.  This is because a CB_LAYOUTRECALL may be received.  The
>     client should not delay returning the layout so as to use that
>     higher-bandwidth path, so it is best if it assumes, in limiting the
>     amount of data to be written, that the write bandwidth is only what
>     is available without the layout, and that it uses this bandwidth
>     assumption even if it does happen to have a layout.
>
> This differs from the text in RFC5661 in a few respects.
>
>     First it says that the amount of dirty data should be the same when
>     you have the layout and when you don't, rather than simply saying it
>     should be small when you have the layout, possibly implying that it
>     should be smaller than when you don't have a layout.
>
>     Second the text now in RFC5661 strongly implies that when you get
>     CB_LAYOUTRECALL, you would normally start new IO's, rather than
>     simply drain the pending IO's and return the layout ASAP.
>
> So I don't agree that what is in RFC5661 is good implementation advice,
> particularly in suggesting that clients should delay the LAYOUTRETURN
> while doing a bunch of IO, including starting new IO's.

That's what clora_changed is for.
It's up to the server to provide this hint to the client, and it's up to
the client to throttle its dirty cache write-behind in response to
CB_LAYOUTRECALL.

If in your implementation flushing dirty data always seems to be a bad
idea, the server can just always set clora_changed to true (though the
hint name is somewhat too specific for the implied semantics).

Benny

> -----Original Message-----
> From: nfsv4-bounces@ietf.org [mailto:nfsv4-bounces@ietf.org] On Behalf
> Of Spencer Shepler
> Sent: Monday, October 25, 2010 10:07 PM
> To: Noveck, David; nfsv4@ietf.org
> Subject: Re: [nfsv4] Write-behind caching
>
> Since this description is part of the general pNFS description, the
> intent may have been to cover a variety of layout types.  However,
> I agree that the client is not guaranteed access to the layout and
> is fully capable of writing the data via the MDS if all else
> fails (inability to obtain the layout after a return); it may not
> be the most performant path but it should be functional.  And maybe
> that is the source of the statement that the client should take
> care in managing its dirty pages given the lack of guarantee of
> access to the supposed, higher throughput path for writing data.
>
> As implementation guidance it seems okay but truly a requirement
> for correct function.
>
> Spencer
>
>> -----Original Message-----
>> From: nfsv4-bounces@ietf.org [mailto:nfsv4-bounces@ietf.org] On Behalf Of
>> david.noveck@emc.com
>> Sent: Monday, October 25, 2010 6:58 PM
>> To: nfsv4@ietf.org
>> Subject: [nfsv4] Write-behind caching
>>
>> The following statement appears at the bottom of page 292 of RFC5661.
>>
>>     However, write-behind caching may negatively
>>     affect the latency in returning a layout in response to a
>>     CB_LAYOUTRECALL; this is similar to file delegations and the impact
>>     that file data caching has on DELEGRETURN.  Client implementations
>>     SHOULD limit the amount of unwritten data they have outstanding at
>>     any one time in order to prevent excessively long responses to
>>     CB_LAYOUTRECALL.
>>
>> This does not seem to make sense to me.
>>
>> First of all the analogy between DELEGRETURN and
>> CB_LAYOUTRECALL/LAYOUTRETURN doesn't seem to me to be correct.  In the
>> case of DELEGRETURN, at least if the file in question has been closed,
>> during the pendency of the delegation, you do need to write all of the
>> dirty data associated with those previously open files.  Normally, clients
>> just write all dirty data.
>>
>> LAYOUTRETURN does not have that sort of requirement.  If it is valid to
>> hold the dirty data when you do have the layout, it is just as valid to
>> hold it when you don't.  You could very well return the layout and get it
>> again before some of those dirty blocks are written.  Having a layout
>> grants you the right to do IO using a particular means (different based on
>> the mapping type), but if you don't have the layout, you still have a way
>> to do the writeback, and there is no particular need to write back all the
>> data before returning the layout.  As mentioned above, you may well get
>> the layout again before there is any need to actually do the write-back.
>>
>> You have to wait until IO's that are in flight are completed before you
>> return the layout.  However, I don't see why you would have to or want to
>> start new IO's using the layout if you have received a CB_LAYOUTRECALL.
>>
>> Am I missing something?  Is there some valid reason for this statement?
>> Or should this be dealt with via the errata mechanism?
>>
>> What do existing clients actually do with pending writeback data when they
>> get a CB_LAYOUTRECALL?  Do they start new IO's using the layout?
>> If so, is there any other reason other than the paragraph above?
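The behaviour discussed in this message can be sketched roughly as follows. This is only an illustrative model, not code from any existing client: the names `RecallPlan`, `plan_layout_recall` and the `flush_cap` parameter are hypothetical. It assumes the server's clora_changed hint is honoured as described above (true means do not start new writes through the recalled layout) and that the client bounds how much dirty data it flushes through the layout before LAYOUTRETURN.

```python
from dataclasses import dataclass


@dataclass
class RecallPlan:
    """Hypothetical summary of what a client does on CB_LAYOUTRECALL."""
    wait_for_in_flight: bool  # I/O already issued via the layout must complete
    flush_via_layout: int     # dirty bytes to write through the layout first
    defer_bytes: int          # dirty bytes kept cached, written back later


def plan_layout_recall(dirty_bytes: int, clora_changed: bool,
                       flush_cap: int) -> RecallPlan:
    """Decide how much dirty data to flush before returning a layout.

    flush_cap is an assumed client-side bound on write-behind data that
    may be pushed through the layout while a recall is pending.
    """
    if dirty_bytes < 0 or flush_cap < 0:
        raise ValueError("byte counts must be non-negative")

    if clora_changed:
        # Server hint: do not start new writes via the recalled layout.
        to_flush = 0
    else:
        # Flushing is allowed, but throttle it so the LAYOUTRETURN
        # is not delayed by an unbounded amount of dirty data.
        to_flush = min(dirty_bytes, flush_cap)

    return RecallPlan(
        wait_for_in_flight=True,
        flush_via_layout=to_flush,
        defer_bytes=dirty_bytes - to_flush,
    )
```

With these assumptions, a server that always sets clora_changed to true makes every recall take the first branch, so the client only drains in-flight I/O and returns the layout; the deferred data can still be written later, for example through the MDS.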