Re: [nfsv4] Write-behind caching

Dean Hildebrand <seattleplus@gmail.com> Wed, 27 October 2010 05:06 UTC


I remember at one time there was a thought that all dirty data would 
have to be written to disk when the client receives a layoutrecall.  
Once the data was written, the client would send a layoutreturn.  I 
think this was the thinking before all the timing issues and other such 
things cropped up.  I assume someone wrote that as general advice, 
somehow thinking that responding to a layoutrecall was more important 
than actually achieving good write performance.
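
(Just to make that old model concrete: a rough sketch, with invented 
names, not taken from any real client.  The handler flushes everything, 
including starting new I/O, before answering the recall.)

    # Old model: on CB_LAYOUTRECALL, push every dirty byte (starting new
    # I/O through the layout if needed) and only then send LAYOUTRETURN.
    # All names (client, recall, write_through_layout, ...) are invented.
    def handle_layoutrecall_flush_everything(client, recall):
        layout = client.layouts[recall.file_handle]
        for extent in client.dirty_extents(recall.file_handle):
            client.write_through_layout(layout, extent)  # new I/O issued here
        client.wait_for_all_io(recall.file_handle)       # drain everything
        client.send_layoutreturn(layout)                 # recall answered late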

In this light, the analogy with delegreturn makes sense if you take a 
very specific example, but obviously not in general.

I would vote to just cut this text, as I think it is simply outdated.
Dean

On 10/26/2010 3:34 AM, david.noveck@emc.com wrote:
> That makes sense.  Let me take on this issue with regard to the file
> layout.  Are there volunteers to address it with regard to block and
> object?  It would be great if we could get together in Beijing, discuss
> this, and come to a joint conclusion to present to the working group
> (via email I mean).  I'm not planning on trying to do this before the
> working group meeting.  In any case, I'm pretty sure there won't be any
> time during the working group meeting.
>
> -----Original Message-----
> From: Spencer Shepler [mailto:sshepler@microsoft.com]
> Sent: Monday, October 25, 2010 11:34 PM
> To: Noveck, David; nfsv4@ietf.org
> Subject: RE: [nfsv4] Write-behind caching
>
>
> Fair enough.  I haven't looked to see if the layout types
> address this specific, needed, behavior.  Obviously the
> statement you reference and the individual layout descriptions
> should be tied together.  Again, I don't remember but there
> may be layout-specific steps needed in the case of handling
> layoutreturns.
>
> In any case, we can handle the eventual conclusion as an errata.
>
> Spencer
>
>
>> -----Original Message-----
>> From: david.noveck@emc.com [mailto:david.noveck@emc.com]
>> Sent: Monday, October 25, 2010 8:25 PM
>> To: Spencer Shepler; nfsv4@ietf.org
>> Subject: RE: [nfsv4] Write-behind caching
>>
>> I agree that the intent was to cover a variety of layout types.
>>
>> I think what you are saying about the issue of different throughputs
>> for having and not having layouts also makes sense.  It may in some
>> way have led to the statement in RFC5661, but those statements are by
>> no means the same.  They have different consequences.  I take it that
>> you are saying (correctly) something like:
>>
>>       However, write-behind implementations will generally need to
>>       bound the amount of unwritten data so that, given the bandwidth
>>       of the output path, the data can be written in a reasonable
>>       time.  Clients which have layouts should avoid keeping larger
>>       amounts to reflect a situation in which a layout provides a
>>       write path of higher bandwidth.  This is because a
>>       CB_LAYOUTRECALL may be received.  The client should not delay
>>       returning the layout so as to use that higher-bandwidth path,
>>       so it is best if it assumes, in limiting the amount of data to
>>       be written, that the write bandwidth is only what is available
>>       without the layout, and that it uses this bandwidth assumption
>>       even if it does happen to have a layout.
>>
>> This differs from the text in RFC5661 in a few respects.
>>
>> 	First, it says that the amount of dirty data should be the same
>> 	when you have the layout and when you don't, rather than simply
>> 	saying it should be small when you have the layout, possibly
>> 	implying that it should be smaller than when you don't have a
>> 	layout.
>>
>> 	Second, the text now in RFC5661 strongly implies that when you
>> 	get a CB_LAYOUTRECALL, you would normally start new IO's, rather
>> 	than simply drain the pending IO's and return the layout ASAP.
>>
>> So I don't agree that what is in RFC5661 is good implementation
>> advice, particularly in suggesting that clients should delay the
>> LAYOUTRETURN while doing a bunch of IO, including starting new IO's.
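
(For what it's worth, the practical upshot of the wording above is a 
sizing rule along these lines; the numbers and names are invented 
purely for illustration:)

    # Size the write-behind cache from the bandwidth available WITHOUT a
    # layout (the MDS path), and use that bound whether or not a layout
    # happens to be held at the moment.
    MDS_WRITE_BANDWIDTH    = 100 * 1024 * 1024    # bytes/sec via the MDS (assumed)
    LAYOUT_WRITE_BANDWIDTH = 1024 * 1024 * 1024   # bytes/sec via the layout (assumed)
    ACCEPTABLE_FLUSH_TIME  = 5.0                  # seconds we will tolerate flushing

    # Per the proposed wording: bound from the non-layout path.
    max_dirty_bytes = MDS_WRITE_BANDWIDTH * ACCEPTABLE_FLUSH_TIME       # ~500 MB

    # The reading of the current RFC5661 text being objected to would
    # instead allow something like:
    #   max_dirty_bytes = LAYOUT_WRITE_BANDWIDTH * ACCEPTABLE_FLUSH_TIME  # ~5 GB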
>>
>>
>> -----Original Message-----
>> From: nfsv4-bounces@ietf.org [mailto:nfsv4-bounces@ietf.org] On Behalf Of
>> Spencer Shepler
>> Sent: Monday, October 25, 2010 10:07 PM
>> To: Noveck, David; nfsv4@ietf.org
>> Subject: Re: [nfsv4] Write-behind caching
>>
>>
>> Since this description is part of the general pNFS description, the
>> intent may have been to cover a variety of layout types.  However, I
>> agree that the client is not guaranteed access to the layout and is
>> fully capable of writing the data via the MDS if all else fails
>> (inability to obtain the layout after a return); it may not be the
>> most performant path but it should be functional.  And maybe that is
>> the source of the statement that the client should take care in
>> managing its dirty pages given the lack of guarantee of access to the
>> supposed higher-throughput path for writing data.
>>
>> As implementation guidance it seems okay, but it is not truly a
>> requirement for correct function.
>>
>> Spencer
>>
>>> -----Original Message-----
>>> From: nfsv4-bounces@ietf.org [mailto:nfsv4-bounces@ietf.org] On Behalf Of
>>> david.noveck@emc.com
>>> Sent: Monday, October 25, 2010 6:58 PM
>>> To: nfsv4@ietf.org
>>> Subject: [nfsv4] Write-behind caching
>>>
>>> The following statement appears at the bottom of page 292 of RFC5661.
>>>
>>>     However, write-behind caching may negatively
>>>     affect the latency in returning a layout in response to a
>>>     CB_LAYOUTRECALL; this is similar to file delegations and the impact
>>>     that file data caching has on DELEGRETURN.  Client implementations
>>>     SHOULD limit the amount of unwritten data they have outstanding at
>>>     any one time in order to prevent excessively long responses to
>>>     CB_LAYOUTRECALL.
>>>
>>> This does not seem to make sense to me.
>>>
>>> First of all, the analogy between DELEGRETURN and
>>> CB_LAYOUTRECALL/LAYOUTRETURN doesn't seem to me to be correct.  In
>>> the case of DELEGRETURN, at least if the file in question has been
>>> closed during the pendency of the delegation, you do need to write
>>> all of the dirty data associated with those previously open files.
>>> Normally, clients just write all dirty data.
>>>
>>> LAYOUTRETURN does not have that sort of requirement.  If it is valid
>>> to hold the dirty data when you do have the layout, it is just as
>>> valid to hold it when you don't.  You could very well return the
>>> layout and get it again before some of those dirty blocks are
>>> written.  Having a layout grants you the right to do IO using a
>>> particular means (different based on the mapping type), but if you
>>> don't have the layout, you still have a way to do the writeback, and
>>> there is no particular need to write back all the data before
>>> returning the layout.  As mentioned above, you may well get the
>>> layout again before there is any need to actually do the write-back.
>>>
>>> You have to wait until IO's that are in flight are completed before
>>> you return the layout.  However, I don't see why you would have to
>>> or want to start new IO's using the layout if you have received a
>>> CB_LAYOUTRECALL.
>>> Am I missing something?  Is there some valid reason for this
>>> statement?  Or should this be dealt with via the errata mechanism?
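
(The behaviour being described, in sketch form: wait only for what is 
already in flight, don't start new I/O through the layout, and return 
promptly.  The names are invented for illustration.)

    def handle_layoutrecall_drain_only(client, recall):
        layout = client.layouts[recall.file_handle]
        client.stop_new_io_via(layout)         # no new I/O on the recalled layout
        client.wait_for_inflight_io(layout)    # only what is already outstanding
        client.send_layoutreturn(layout)       # answer the recall promptly
        # Remaining dirty pages are flushed later, through the MDS or
        # through a fresh layout if LAYOUTGET succeeds again first.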
>>>
>>> What do existing clients actually do with pending writeback data
>>> when they get a CB_LAYOUTRECALL?  Do they start new IO's using the
>>> layout?  If so, is there any reason other than the paragraph above?