Re: [nfsv4] Write-behind caching

Benny Halevy <bhalevy@panasas.com> Wed, 27 October 2010 16:25 UTC

Message-ID: <4CC852FA.4050903@panasas.com>
Date: Wed, 27 Oct 2010 18:27:38 +0200
From: Benny Halevy <bhalevy@panasas.com>
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.9) Gecko/20100921 Fedora/3.1.4-1.fc13 Thunderbird/3.1.4
MIME-Version: 1.0
To: david.noveck@emc.com
References: <BF3BB6D12298F54B89C8DCC1E4073D80028C76DB@CORPUSMX50A.corp.emc.com> <E043D9D8EE3B5743B8B174A814FD584F0D498D54@TK5EX14MBXC126.redmond.corp.microsoft.com> <BF3BB6D12298F54B89C8DCC1E4073D80028C76E0@CORPUSMX50A.corp.emc.com>
In-Reply-To: <BF3BB6D12298F54B89C8DCC1E4073D80028C76E0@CORPUSMX50A.corp.emc.com>
Content-Type: text/plain; charset="ISO-8859-1"
Content-Transfer-Encoding: 7bit
X-OriginalArrivalTime: 27 Oct 2010 16:27:40.0237 (UTC) FILETIME=[DFCBCFD0:01CB75F3]
Cc: nfsv4@ietf.org
Subject: Re: [nfsv4] Write-behind caching

On 2010-10-26 05:24, david.noveck@emc.com wrote:
> I agree that the intent was to cover a variety of layout types.
> 
> I think what you are saying about the issue of different throughputs for
> having and not having layouts also makes sense.  It may in some way have
> led to the statement in RFC5661, but the two statements are by no means
> the same.  They have different consequences.  I take it that you are
> saying (correctly) something like:
> 
>      However, write-behind implementations will generally need to bound
>      the amount of unwritten data so that, given the bandwidth of the
>      output path, the data can be written in a reasonable time.  Clients
>      which have layouts should avoid keeping larger amounts to reflect a
>      situation in which a layout provides a write path of higher
>      bandwidth.  This is because a CB_LAYOUTRECALL may be received.  The
>      client should not delay returning the layout so as to use that
>      higher-bandwidth path, so it is best if it assumes, in limiting the
>      amount of data to be written, that the write bandwidth is only what
>      is available without the layout, and that it uses this bandwidth
>      assumption even if it does happen to have a layout.
> 
> This differs from the text in RFC5661 in a few respects.
> 
> 	First, it says that the amount of dirty data should be the same
> 	when you have the layout and when you don't, rather than simply
> 	saying it should be small when you have the layout, possibly
> 	implying that it should be smaller than when you don't have a
> 	layout.
> 
> 	Second, the text now in RFC5661 strongly implies that when you
> 	get a CB_LAYOUTRECALL, you would normally start new IO's, rather
> 	than simply drain the pending IO's and return the layout ASAP.
> 
> So I don't agree that what is in RFC5661 is good implementation advice,
> particularly in suggesting that clients should delay the LAYOUTRETURN
> while doing a bunch of IO, including starting new IO's.


That's what clora_changed is for.
It's up to the server to provide this hint to the client,
and it's up to the client to throttle its dirty-cache write-behind in response
to CB_LAYOUTRECALL.
If in your implementation flushing dirty data always seems to be a bad idea,
the server can just always set clora_changed to true (though the hint's name
is somewhat too specific for the implied semantics).

Benny
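A rough sketch of the policy described above, with clora_changed steering whether dirty data is flushed through the recalled layout, might look like the following. The function and action names are purely illustrative and are not taken from any real client implementation:

```python
def handle_cb_layoutrecall(clora_changed):
    """Illustrative client response to CB_LAYOUTRECALL.

    Returns the ordered list of actions a client would take; the action
    names are hypothetical labels, not real operations.
    """
    # New I/Os must not be started against a recalled layout; only the
    # ones already in flight are allowed to complete.
    actions = ["drain_inflight_io"]

    if clora_changed:
        # The server hints that the layout is changing: do not push
        # dirty data through the recalled layout.  Return it promptly
        # and write the data back later, via the MDS or via a freshly
        # acquired layout.
        actions += ["return_layout", "writeback_later_via_mds"]
    else:
        # The layout still reflects reality: the client may flush dirty
        # data through it before returning, at the cost of a longer
        # recall latency.
        actions += ["flush_dirty_via_layout", "return_layout"]

    return actions
```

A server that never wants clients flushing through recalled layouts would, as noted above, simply set clora_changed to true on every recall.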

>  
> 
> -----Original Message-----
> From: nfsv4-bounces@ietf.org [mailto:nfsv4-bounces@ietf.org] On Behalf
> Of Spencer Shepler
> Sent: Monday, October 25, 2010 10:07 PM
> To: Noveck, David; nfsv4@ietf.org
> Subject: Re: [nfsv4] Write-behind caching
> 
> 
> Since this description is part of the general pNFS description, the
> intent may have been to cover a variety of layout types.  However,
> I agree that the client is not guaranteed access to the layout and
> is fully capable of writing the data via the MDS if all else
> fails (inability to obtain the layout after a return); it may not
> be the most performant path but it should be functional.  And maybe
> that is the source of the statement that the client should take
> care in managing its dirty pages given the lack of guarantee of
> access to the supposed, higher throughput path for writing data.
> 
> As implementation guidance it seems okay, but it is not truly a
> requirement for correct function.
> 
> Spencer
> 
>> -----Original Message-----
>> From: nfsv4-bounces@ietf.org [mailto:nfsv4-bounces@ietf.org] On Behalf
>> Of david.noveck@emc.com
>> Sent: Monday, October 25, 2010 6:58 PM
>> To: nfsv4@ietf.org
>> Subject: [nfsv4] Write-behind caching
>>
>> The following statement appears at the bottom of page 292 of RFC5661.
>>
>>    However, write-behind caching may negatively
>>    affect the latency in returning a layout in response to a
>>    CB_LAYOUTRECALL; this is similar to file delegations and the impact
>>    that file data caching has on DELEGRETURN.  Client implementations
>>    SHOULD limit the amount of unwritten data they have outstanding at
>>    any one time in order to prevent excessively long responses to
>>    CB_LAYOUTRECALL.
>>
>> This does not seem to make sense to me.
>>
>> First of all, the analogy between DELEGRETURN and
>> CB_LAYOUTRECALL/LAYOUTRETURN doesn't seem to me to be correct.  In the
>> case of DELEGRETURN, at least if the file in question has been closed
>> during the pendency of the delegation, you do need to write all of the
>> dirty data associated with those previously open files.  Normally,
>> clients just write all dirty data.
>>
>> LAYOUTRETURN does not have that sort of requirement.  If it is valid
>> to hold the dirty data when you do have the layout, it is just as
>> valid to hold it when you don't.  You could very well return the
>> layout and get it again before some of those dirty blocks are written.
>> Having a layout grants you the right to do IO using a particular means
>> (different based on the mapping type), but if you don't have the
>> layout, you still have a way to do the writeback, and there is no
>> particular need to write back all the data before returning the
>> layout.  As mentioned above, you may well get the layout again before
>> there is any need to actually do the write-back.
>>
>> You have to wait until IO's that are in flight have completed before
>> you return the layout.  However, I don't see why you would have to or
>> want to start new IO's using the layout if you have received a
>> CB_LAYOUTRECALL.
>>
>> Am I missing something?  Is there some valid reason for this
>> statement?  Or should this be dealt with via the errata mechanism?
>>
>> What do existing clients actually do with pending writeback data when
>> they get a CB_LAYOUTRECALL?  Do they start new IO's using the layout?
>> If so, is there any reason other than the paragraph above?
>> _______________________________________________
>> nfsv4 mailing list
>> nfsv4@ietf.org
>> https://www.ietf.org/mailman/listinfo/nfsv4
> 