Re: [nfsv4] Write-behind caching
Dean Hildebrand <seattleplus@gmail.com> Wed, 27 October 2010 05:06 UTC
Message-ID: <4CC7B3AE.8000802@gmail.com>
Date: Tue, 26 Oct 2010 22:07:58 -0700
From: Dean Hildebrand <seattleplus@gmail.com>
To: nfsv4@ietf.org
References: <BF3BB6D12298F54B89C8DCC1E4073D80028C76DB@CORPUSMX50A.corp.emc.com> <E043D9D8EE3B5743B8B174A814FD584F0D498D54@TK5EX14MBXC126.redmond.corp.microsoft.com> <BF3BB6D12298F54B89C8DCC1E4073D80028C76E0@CORPUSMX50A.corp.emc.com> <E043D9D8EE3B5743B8B174A814FD584F0D498E1D@TK5EX14MBXC126.redmond.corp.microsoft.com> <BF3BB6D12298F54B89C8DCC1E4073D80028C76EA@CORPUSMX50A.corp.emc.com>
In-Reply-To: <BF3BB6D12298F54B89C8DCC1E4073D80028C76EA@CORPUSMX50A.corp.emc.com>
Subject: Re: [nfsv4] Write-behind caching
I remember at one time there was a thought that all dirty data would have to be written to disk when the client receives a layoutrecall. Once the data was written, it would send a layoutreturn. I think this was the thinking before all the timing issues and other such things cropped up. I assume someone wrote that as general advice, somehow thinking that responding to a layoutrecall was more important than actually achieving good write performance. In this light, the analogy with delegreturn makes sense if you take a very specific example, but obviously not in general. I would vote to just cut this text, as I think it is simply outdated.

Dean

On 10/26/2010 3:34 AM, david.noveck@emc.com wrote:
> That makes sense. Let me take on this issue with regard to the file
> layout. Are there volunteers to address it with regard to block and
> object? It would be great if we could get together in Beijing, discuss
> this, and come to a joint conclusion to present to the working group
> (via email I mean). I'm not planning on trying to do this before the
> working group meeting. In any case, I'm pretty sure there won't be any
> time during the working group meeting.
>
> -----Original Message-----
> From: Spencer Shepler [mailto:sshepler@microsoft.com]
> Sent: Monday, October 25, 2010 11:34 PM
> To: Noveck, David; nfsv4@ietf.org
> Subject: RE: [nfsv4] Write-behind caching
>
> Fair enough. I haven't looked to see if the layout types
> address this specific, needed, behavior. Obviously the
> statement you reference and the individual layout descriptions
> should be tied together. Again, I don't remember, but there
> may be layout-specific steps needed in the case of handling
> layoutreturns.
>
> In any case, we can handle the eventual conclusion as an errata.
>
> Spencer
>
>> -----Original Message-----
>> From: david.noveck@emc.com [mailto:david.noveck@emc.com]
>> Sent: Monday, October 25, 2010 8:25 PM
>> To: Spencer Shepler; nfsv4@ietf.org
>> Subject: RE: [nfsv4] Write-behind caching
>>
>> I agree that the intent was to cover a variety of layout types.
>>
>> I think what you are saying about the issue of different throughputs
>> for having and not having layouts also makes sense. It may in some way
>> have led to the statement in RFC5661, but those statements are by no
>> means the same. They have different consequences. I take it that you
>> are saying (correctly) something like:
>>
>>    However, write-behind implementations will generally need to bound
>>    the amount of unwritten data so that, given the bandwidth of the
>>    output path, the data can be written in a reasonable time. Clients
>>    which have layouts should avoid keeping larger amounts to reflect a
>>    situation in which a layout provides a write path of higher
>>    bandwidth. This is because a CB_LAYOUTRECALL may be received. The
>>    client should not delay returning the layout so as to use that
>>    higher-bandwidth path, so it is best if it assumes, in limiting the
>>    amount of data to be written, that the write bandwidth is only what
>>    is available without the layout, and that it uses this bandwidth
>>    assumption even if it does happen to have a layout.
>>
>> This differs from the text in RFC5661 in a few respects.
>>
>> First, it says that the amount of dirty data should be the same when
>> you have the layout and when you don't, rather than simply saying it
>> should be small when you have the layout, possibly implying that it
>> should be smaller than when you don't have a layout.
>>
>> Second, the text now in RFC5661 strongly implies that when you get a
>> CB_LAYOUTRECALL, you would normally start new I/Os, rather than
>> simply drain the pending I/Os and return the layout ASAP.
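[Editor's note: the bound described in the proposed text above can be illustrated with a small sketch. All names here are hypothetical, not from the thread or any real client: the idea is simply to size the write-behind cache for the bandwidth available *without* a layout (the MDS path), and to apply the same limit whether or not a layout is currently held, so a CB_LAYOUTRECALL never forces a long flush.]

```python
# Hypothetical sketch of the dirty-data bound discussed above: limit
# unwritten data to what the non-layout (MDS) write path can flush in a
# target time, ignoring any higher-bandwidth layout path.

def dirty_data_limit(mds_bandwidth_bytes_per_s: float,
                     target_flush_seconds: float) -> int:
    """Maximum unwritten (dirty) bytes the client should accumulate."""
    return int(mds_bandwidth_bytes_per_s * target_flush_seconds)

class WriteBehindCache:
    def __init__(self, mds_bandwidth_bytes_per_s: float,
                 target_flush_seconds: float):
        # The limit is deliberately the same whether or not a layout is
        # held: it assumes only the bandwidth available without a layout.
        self.limit = dirty_data_limit(mds_bandwidth_bytes_per_s,
                                      target_flush_seconds)
        self.dirty_bytes = 0

    def must_flush_before_buffering(self, nbytes: int) -> bool:
        """True if accepting nbytes more dirty data would exceed the bound."""
        return self.dirty_bytes + nbytes > self.limit
```

For example, a 100 MB/s MDS path and a 2-second flush target would bound dirty data at 200 MB, with or without a layout.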
>>
>> So I don't agree that what is in RFC5661 is good implementation
>> advice, particularly in suggesting that clients should delay the
>> LAYOUTRETURN while doing a bunch of I/O, including starting new I/Os.
>>
>> -----Original Message-----
>> From: nfsv4-bounces@ietf.org [mailto:nfsv4-bounces@ietf.org] On Behalf
>> Of Spencer Shepler
>> Sent: Monday, October 25, 2010 10:07 PM
>> To: Noveck, David; nfsv4@ietf.org
>> Subject: Re: [nfsv4] Write-behind caching
>>
>> Since this description is part of the general pNFS description, the
>> intent may have been to cover a variety of layout types. However, I
>> agree that the client is not guaranteed access to the layout and is
>> fully capable of writing the data via the MDS if all else fails
>> (inability to obtain the layout after a return); it may not be the
>> most performant path, but it should be functional. And maybe that is
>> the source of the statement that the client should take care in
>> managing its dirty pages, given the lack of any guarantee of access
>> to the supposed higher-throughput path for writing data.
>>
>> As implementation guidance it seems okay, but it is not truly a
>> requirement for correct function.
>>
>> Spencer
>>
>>> -----Original Message-----
>>> From: nfsv4-bounces@ietf.org [mailto:nfsv4-bounces@ietf.org] On
>>> Behalf Of david.noveck@emc.com
>>> Sent: Monday, October 25, 2010 6:58 PM
>>> To: nfsv4@ietf.org
>>> Subject: [nfsv4] Write-behind caching
>>>
>>> The following statement appears at the bottom of page 292 of RFC5661:
>>>
>>>    However, write-behind caching may negatively
>>>    affect the latency in returning a layout in response to a
>>>    CB_LAYOUTRECALL; this is similar to file delegations and the impact
>>>    that file data caching has on DELEGRETURN. Client implementations
>>>    SHOULD limit the amount of unwritten data they have outstanding at
>>>    any one time in order to prevent excessively long responses to
>>>    CB_LAYOUTRECALL.
>>>
>>> This does not seem to make sense to me.
>>>
>>> First of all, the analogy between DELEGRETURN and
>>> CB_LAYOUTRECALL/LAYOUTRETURN doesn't seem to me to be correct. In
>>> the case of DELEGRETURN, at least if the file in question has been
>>> closed during the pendency of the delegation, you do need to write
>>> all of the dirty data associated with those previously open files.
>>> Normally, clients just write all dirty data.
>>>
>>> LAYOUTRETURN does not have that sort of requirement. If it is valid
>>> to hold the dirty data when you do have the layout, it is just as
>>> valid to hold it when you don't. You could very well return the
>>> layout and get it again before some of those dirty blocks are
>>> written. Having a layout grants you the right to do I/O using a
>>> particular means (different based on the mapping type), but if you
>>> don't have the layout, you still have a way to do the writeback, and
>>> there is no particular need to write back all the data before
>>> returning the layout. As mentioned above, you may well get the
>>> layout again before there is any need to actually do the write-back.
>>>
>>> You have to wait until I/Os that are in flight are completed before
>>> you return the layout. However, I don't see why you would have to,
>>> or want to, start new I/Os using the layout if you have received a
>>> CB_LAYOUTRECALL.
>>>
>>> Am I missing something? Is there some valid reason for this
>>> statement? Or should this be dealt with via the errata mechanism?
>>>
>>> What do existing clients actually do with pending writeback data
>>> when they get a CB_LAYOUTRECALL? Do they start new I/Os using the
>>> layout? If so, is there any other reason other than the paragraph
>>> above?
>>>
>>> _______________________________________________
>>> nfsv4 mailing list
>>> nfsv4@ietf.org
>>> https://www.ietf.org/mailman/listinfo/nfsv4