Re: [nfsv4] New version of sparse draft (draft-hildebrand-nfsv4-read-sparse-01.txt)

"J. Bruce Fields" <bfields@fieldses.org> Thu, 30 September 2010 20:33 UTC

Date: Thu, 30 Sep 2010 16:33:55 -0400
From: "J. Bruce Fields" <bfields@fieldses.org>
To: Dean Hildebrand <seattleplus@gmail.com>
Message-ID: <20100930203354.GE15207@fieldses.org>
References: <4CA3CE95.10407@gmail.com> <20100930175927.GB11836@fieldses.org> <4CA4D56A.7040905@gmail.com> <20100930183752.GC11836@fieldses.org> <20100930191627.GB15207@fieldses.org> <4CA4E3CC.2030900@gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
In-Reply-To: <4CA4E3CC.2030900@gmail.com>
User-Agent: Mutt/1.5.20 (2009-06-14)
Cc: "nfsv4@ietf.org" <nfsv4@ietf.org>
Subject: Re: [nfsv4] New version of sparse draft (draft-hildebrand-nfsv4-read-sparse-01.txt)

On Thu, Sep 30, 2010 at 12:23:56PM -0700, Dean Hildebrand wrote:
> 
> 
> On 9/30/2010 12:16 PM, J. Bruce Fields wrote:
> >On Thu, Sep 30, 2010 at 02:37:52PM -0400, J. Bruce Fields wrote:
> >>On Thu, Sep 30, 2010 at 11:22:34AM -0700, Dean Hildebrand wrote:
> >>>
> >>>On 9/30/2010 10:59 AM, J. Bruce Fields wrote:
> >>>>On Wed, Sep 29, 2010 at 04:41:09PM -0700, Dean Hildebrand wrote:
> >>>>>  Hello,
> >>>>>
> >>>>>I uploaded a new version of our internet draft "Simple and Efficient
> >>>>>Read Support for Sparse Files".
> >>>>>
> >>>>>http://www.ietf.org/id/draft-hildebrand-nfsv4-read-sparse-01.txt
> >>>>>
> >>>>>Please have a look and give us any feedback.
> >>>>>
> >>>>>The main changes are:
> >>>>>1) Added section regarding pNFS (which basically states that there
> >>>>>are no real changes to how pNFS works)
> >>>>>2) Added example sparse read message flow
> >>>>>3) Added statement to the effect that if a client sends a read
> >>>>>request for a chunk that ends in zeroes, the server can return a
> >>>>>short read (thanks to Benny)
> >>>>>4) I didn't change the return value.  There was a suggestion to use
> >>>>>a flag instead of overloading the length parameter, but I'm not sure
> >>>>>what this gives us (although it might be more clear in the doc)
> >>>>How will the client take advantage of the data_length field?
> >>>I'm not 100% sure what you are asking, but all read requests into
> >>>the region specified by data_offset and data_length cannot be
> >>>assumed to be zero.  The data must come from either the client cache
> >>>or server.  The example was designed to answer this type of
> >>>question, does it help?
> >>I think I understand what data_offset *means*.  What I don't understand
> >>is how the client can use data_offset to improve performance.
> >Knowing that there may be nonzero values in the range
> >[data_offset,data_offset+data_length] isn't very useful--you're still
> >going to have to perform reads to get the data in that range.
> >
> >The information that really *is* useful is the information that all data
> >in the range [read offset,data_offset] is zero.
> >
> >So data_length doesn't give the client any new information it can use.
> >
> >Simpler, I think, would be to either drop data_length entirely, or to
> >return a range describing the extent of the current *hole*, instead of
> >the extent of the next *non-hole*.
> >
> >And just write the spec to say that the server's reply means that a read
> >of the returned range would have returned all zeroes.
> 
> I think the original motivation was for the client to adapt its
> requests using this field, so it wouldn't try to read into a hole.
> So instead of reading 64K, which might go past
> data_offset+data_length, it simply reads the appropriate amount of
> data up until data_offset+data_length.  Now, the server can always
> return a short read, but that does put the onus on the server to do
> fine-grained checking of zero and non-zero data for every read
> request, which may be quite onerous.

I don't understand.  Either the server finds out where the holes are, or
it doesn't--up to it.  If it knows where they are, then returning a
short read isn't any more or less hard than returning the correct
data_length would have been on the earlier read.

Or it could also just return the zeroes at the end of the read that
crosses the beginning of the next hole.  That doesn't strike me as a big
problem.  The client will still find out about a hole on its next read.
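
A minimal sketch of those two server choices, assuming the server keeps
a sorted list of (start, length) data extents and that the read starts
inside one of them. The function name and the extent list are
hypothetical, not taken from the draft:

```python
def server_read_len(extents, offset, count, short_read):
    """Return how many bytes a hypothetical server sends back.

    extents: sorted (start, length) pairs describing non-hole regions
             (an assumed server-side data structure).
    short_read: if True, stop at the start of the next hole; if False,
             return the full count, trailing zeroes included.
    Assumes the file is at least offset + count bytes long.
    """
    end = offset + count
    for start, length in extents:
        data_end = start + length
        if start <= offset < data_end:
            if short_read and end > data_end:
                return data_end - offset  # short read ending at the hole
            return count                  # data followed by zeroes
    raise ValueError("offset is not inside a data extent")


extents = [(0, 100000)]  # data in [0, 100000), hole afterwards
print(server_read_len(extents, 65536, 65536, short_read=True))
print(server_read_len(extents, 65536, 65536, short_read=False))
```

Either way the same extent lookup is needed, and the client's next
read, which starts in the hole, is where it learns about the hole.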

But, sure, I can believe that the client aligning its reads with
respect to the possible boundary at data_offset+data_length could have
some benefit, and I didn't understand that before--thanks for pointing
it out.
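
As a rough sketch of that alignment, assuming the client remembers the
data_offset and data_length from an earlier reply: bytes before
data_offset are known to be zero, and the next read is trimmed so it
does not run past data_offset+data_length. The function and its return
shape are hypothetical, not part of the draft:

```python
def plan_read(offset, want, data_offset, data_length):
    """Plan one client read using a remembered (data_offset, data_length) hint.

    Returns (zero_fill, start, length):
      zero_fill - bytes from `offset` the client can fill with zeroes locally
      start     - offset of the read actually sent to the server
      length    - size of that read, trimmed at data_offset + data_length
    The hint says nothing about bytes past data_offset + data_length, so a
    length of 0 there only means "not covered by this hint".
    """
    data_end = data_offset + data_length
    zero_fill = min(want, max(0, data_offset - offset))
    start = offset + zero_fill
    length = max(0, min(want - zero_fill, data_end - start))
    return zero_fill, start, length


# A 64K read that would otherwise cross the boundary at 100000:
print(plan_read(65536, 65536, 65536, 34464))
# A 64K read starting in a hole that ends at 4096:
print(plan_read(0, 65536, 4096, 8192))
```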

It's still not clear to me how beneficial that would really be, though.

> So I still think it has a
> purpose and should be kept in the draft.
> 
> btw, I like your comment regarding allocated and unallocated data.
> I'll use that as motivation and refer to zero and non-zero data in
> the protocol description.

OK, thanks.

--b.