Re: [nfsv4] New version of sparsedraft(draft-hildebrand-nfsv4-read-sparse-01.txt)

"Roy, Dipankar" <Dipankar.Roy@netapp.com> Mon, 04 October 2010 07:52 UTC

From: "Roy, Dipankar" <Dipankar.Roy@netapp.com>
To: "Erasani, Pranoop" <Pranoop.Erasani@netapp.com>, "Matt W. Benjamin" <matt@linuxbox.com>, "J. Bruce Fields" <bfields@fieldses.org>
Cc: Benny Halevy <bhalevy@panasas.com>, nfsv4@ietf.org
Subject: Re: [nfsv4] New version of sparsedraft(draft-hildebrand-nfsv4-read-sparse-01.txt)

Hi Pranoop,

Dean and I were discussing these use cases in an offline thread.
We haven't reached any definite conclusions about them. Please note
that by "sparse map" in the following, I mean a structure describing
the holes in a file, so that no READ is needed to get the same
information.
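
For illustration only, one way to picture such a sparse map is as a
list of (offset, length) hole extents. The sketch below is a
hypothetical in-memory representation in Python, not the wire format
of the draft; the names Extent, holes_from_data_extents, and is_hole
are assumptions made up for this example.

```python
from typing import List, Tuple

# Hypothetical representation: an extent is (offset, length) in bytes.
Extent = Tuple[int, int]


def holes_from_data_extents(file_size: int, data: List[Extent]) -> List[Extent]:
    """Derive the hole extents of a file from its allocated (data) extents.

    Assumes every data extent lies within [0, file_size).
    """
    holes: List[Extent] = []
    pos = 0  # first byte not yet classified as data or hole
    for offset, length in sorted(data):
        if offset > pos:
            # Gap between the previous data extent and this one is a hole.
            holes.append((pos, offset - pos))
        pos = max(pos, offset + length)
    if pos < file_size:
        # Trailing hole up to end of file.
        holes.append((pos, file_size - pos))
    return holes


def is_hole(holes: List[Extent], offset: int, length: int) -> bool:
    """Return True if [offset, offset + length) lies entirely inside one hole."""
    return any(
        h_off <= offset and offset + length <= h_off + h_len
        for h_off, h_len in holes
    )


sparse_map = holes_from_data_extents(100, [(10, 20), (50, 10)])
```

With a map like this in hand, a client could answer "is this range a
hole?" locally via is_hole(sparse_map, offset, length) instead of
issuing a READ for the range.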

1. It would be nice to have a mechanism where a client can get a sparse
map of a bunch of files together (similar to a READDIR), instead of
performing READ on each individual file.

2. A sparse map enables clients and servers to share the information
instead of each server and client performing READ multiple times to
get the same map. I believe this will be useful for the server-side
copy RFC implementation and for peer-to-peer NFS.

3. Since we are considering only READ performance here, it is
unclear whether a sparse map would be a more efficient way to figure
out holes for a read delegation.

4. If the client punches holes in a file (i.e., the write-holes case),
would it make sense to perform a READ of the same area to find out
whether it is a hole?

Thanks
Dipankar 

-----Original Message-----
From: Erasani, Pranoop
Sent: Sun 10/3/2010 11:57 PM
To: Roy, Dipankar; Matt W. Benjamin; J. Bruce Fields
Cc: Benny Halevy; nfsv4@ietf.org
Subject: RE: [nfsv4] New version of sparsedraft(draft-hildebrand-nfsv4-read-sparse-01.txt)
 
Hi Dipankar,

Great.

You had mentioned other use cases in one of your private e-mails. Would you mind sending e-mail to the WG alias as well?

- Pranoop

> -----Original Message-----
> From: Roy, Dipankar
> Sent: Sunday, October 03, 2010 5:49 AM
> To: Erasani, Pranoop; Matt W. Benjamin; J. Bruce Fields
> Cc: Benny Halevy; nfsv4@ietf.org
> Subject: RE: [nfsv4] New version of sparsedraft(draft-hildebrand-nfsv4-
> read-sparse-01.txt)
> 
> Hi Pranoop,
> 
> Thanks a lot for proposing this.
> 
> I think the NFS server side copy RFC implementation can definitely
> benefit from this.
> 
> Regards
> Dipankar
> 
> -----Original Message-----
> From: Erasani, Pranoop
> Sent: Sun 10/3/2010 12:44 PM
> To: Matt W. Benjamin; J. Bruce Fields
> Cc: Benny Halevy; nfsv4@ietf.org
> Subject: Re: [nfsv4] New version of sparsedraft(draft-hildebrand-nfsv4-
> read-sparse-01.txt)
> 
> > Hi,
> >
> > Just relative to pNFS, my immediate reaction was that a DS might have
> > the relevant file allocation information, and the MDS might not.
> 
> A DS might have the relevant file allocation information for the data
> it holds itself, but not for data on other DSs. AFAIK, pNFS does not
> mandate that each DS know about other DSs' information. It's the MDS
> that has access to DS information.
> 
> Given the current proposal, as part of the READ response, the DS is
> supposed to send the offset where the hole ends and the actual data
> begins. If the hole ends on a different DS, how is the DS responding
> to the READ supposed to know this? How does the data server compute
> this information? Are all DSs aware of how data is organized on other
> DSs, or is it the duty of the MDS to own that information? This is
> where I feel that the existing proposal puts onerous requirements on
> the pNFS data servers.
> 
> > With
> > READZ (is that the current operation name?), this doesn't seem to
> > present a problem.
> 
> The draft seems to address the problem by just stating that each DS
> needs to know the other DSs' information.
> 
>    receives.  In addition, when a data server is returning a
>    READ4reshole structure, it should still contain the offset and
>    length of the next allocated block in the file, even if that block
>    is not located on that particular data server.
> 
> However, that is not a requirement that pNFS puts on the data servers,
> is it? Are we heading down a path of new requirements for pNFS data
> servers here?
> 
> If my suspicion is true, the spirit of this proposal would discourage
> pNFS servers from implementing the sparse hints.
> 
> > It would, I guess, perhaps also not present a
> > problem if clients could get a hole map from a DS, but I think that's
> > not what the prior email seemed to be describing?
> 
> Well... I started from the fact that some vendors could consider the
> hole map to be metadata, which implies that the MDS would be in a
> better position to serve it than individual data servers (especially
> if the holes span data servers).
> 
> To address the pNFS-specific concerns from my original e-mail, we need
> to answer:
> 
> 1) Who owns the hole information?
> 2) Who sends the hole information?
> 3) How efficiently can they communicate hole information spanning the
>    pNFS server set?
> 
> - Pranoop
> 
> > Thanks,
> >
> > Matt
> >
> > ----- "J. Bruce Fields" <bfields@fieldses.org> wrote:
> >
> > >
> > > A few questions about a map:
> > >
> > > 	- What is its lifetime?  Will it be a recallable object like a
> > > 	  layout, or does the client invalidate it normally whenever it
> > > 	  would invalidate its data cache?
> > > 	- Does requesting the block map break write delegations, or (on
> > > 	  servers that support atime) update the atime?
> > > 	- How does a request for a map interact with mandatory
> > > 	  locks?
> > >
> > > --b.
> > > _______________________________________________
> > > nfsv4 mailing list
> > > nfsv4@ietf.org
> > > https://www.ietf.org/mailman/listinfo/nfsv4
> >
> > --
> >
> > Matt Benjamin
> >
> > The Linux Box
> > 206 South Fifth Ave. Suite 150
> > Ann Arbor, MI  48104
> >
> > http://linuxbox.com
> >
> > tel. 734-761-4689
> > fax. 734-769-8938
> > cel. 734-216-5309