Re: [nfsv4] New version of sparse draft(draft-hildebrand-nfsv4-read-sparse-01.txt)

"Erasani, Pranoop" <Pranoop.Erasani@netapp.com> Sun, 03 October 2010 18:28 UTC

Date: Sun, 03 Oct 2010 11:29:17 -0700
Message-ID: <43EEF8704A569749804F545E3306FCE30921519A@SACMVEXC3-PRD.hq.netapp.com>
In-Reply-To: <1962942208.186.1286119228985.JavaMail.root@thunderbeast.private.linuxbox.com>
References: <1166093344.184.1286118643092.JavaMail.root@thunderbeast.private.linuxbox.com> <1962942208.186.1286119228985.JavaMail.root@thunderbeast.private.linuxbox.com>
From: "Erasani, Pranoop" <Pranoop.Erasani@netapp.com>
To: "Matt W. Benjamin" <matt@linuxbox.com>
Cc: "J. Bruce Fields" <bfields@fieldses.org>, Benny Halevy <bhalevy@panasas.com>, nfsv4@ietf.org
Subject: Re: [nfsv4] New version of sparse draft(draft-hildebrand-nfsv4-read-sparse-01.txt)


> -----Original Message-----
> From: Matt W. Benjamin [mailto:matt@linuxbox.com]
> Sent: Sunday, October 03, 2010 8:20 AM
> To: Erasani, Pranoop
> Cc: Benny Halevy; nfsv4@ietf.org; J. Bruce Fields
> Subject: Re: [nfsv4] New version of sparse draft(draft-hildebrand-
> nfsv4-read-sparse-01.txt)
> 
> Hi Pranoop,
> 
> I noted myself that the next-offset return requirement was potentially
> onerous, for the reason you state.  However, the response from Dean, as
> explicated by Bruce, is that the DS was not meant to be obligated to
> return the next filled offset, merely to return the information it has.
> (Yes, you've raised an interesting point about this.)

Yup, if pNFS data servers are not obliged to return the next offset, the
hint has little value w.r.t. pNFS.

> It still seems to me, offhand, just as problematic for a different
> subset of implementations to have a new requirement to propagate hole
> information (synchronously?) to the MDS, as to have the DSes
> collectively aware of it.  That is, is it not a potential advantage of
> pNFS, as currently specified, that the MDS need not have global hole
> information either?

I totally agree. I was not requiring that every MDS have information
about every DS's holes. My initial wording may have hinted at such a
requirement, but that was not the intent. I only meant to state that
our server could potentially have the information.

But, in general, an MDS is more likely to have information about a DS
than one DS is to have about another DS.
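
To make the cross-DS case concrete, here is a minimal Python sketch. It
assumes plain round-robin striping; the stripe unit size, the number of
data servers, and the helper names (owner, next_data, local_view) are
all made up for illustration and come neither from the draft nor from
any pNFS layout type. It only shows that a DS answering from its own
extents cannot name the next allocated offset when the hole ends on
another DS, while a holder of the global map can.

```python
STRIPE_UNIT = 4096   # hypothetical stripe unit size in bytes
NUM_DS = 2           # hypothetical number of data servers


def owner(offset):
    """Index of the data server holding the byte at `offset` (round-robin)."""
    return (offset // STRIPE_UNIT) % NUM_DS


def next_data(extents, offset):
    """Offset of the first allocated byte at or after `offset`, or None.

    `extents` is a sorted list of non-overlapping (start, end) byte
    ranges, with `end` exclusive.
    """
    for start, end in extents:
        if offset < end:
            return max(start, offset)
    return None


def local_view(extents, ds):
    """Restrict a global extent list to the pieces stored on server `ds`."""
    out = []
    for start, end in extents:
        pos = start
        while pos < end:
            unit_end = (pos // STRIPE_UNIT + 1) * STRIPE_UNIT
            piece_end = min(end, unit_end)
            if owner(pos) == ds:
                out.append((pos, piece_end))
            pos = piece_end
    return out


# The file has data in [0, 1000) and [6000, 7000); the rest is a hole.
global_extents = [(0, 1000), (6000, 7000)]
ds0 = local_view(global_extents, 0)   # DS 0 holds stripe unit [0, 4096)
ds1 = local_view(global_extents, 1)   # DS 1 holds stripe unit [4096, 8192)

# A READ at offset 2000 is served by DS 0.
print(next_data(ds0, 2000))             # what DS 0 knows on its own
print(next_data(global_extents, 2000))  # what a global map would answer
```

With these assumed numbers, DS 0 has nothing to report past offset 2000
from its own extents, whereas the global map places the next data at
6000, which lives on DS 1.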

- Pranoop

> 
> Thanks,
> 
> Matt
> 
> ----- "Pranoop Erasani" <Pranoop.Erasani@netapp.com> wrote:
> 
> > > Hi,
> > >
> > > Just relative to pNFS, my immediate reaction was that a DS might have
> > > the relevant file allocation information, and the MDS might not.
> >
> > A DS might have relevant file allocation information about the data it
> > holds itself, not about data on other DSs. AFAIK, pNFS does not mandate
> > that each DS know about other DSs' information. It's the MDS that has
> > access to DS information.
> >
> > Given the current proposal, as part of the READ response, the DS is
> > supposed to send the offset where the hole ends and the actual data
> > begins. If the hole ends on a different DS, how is the DS responding
> > to the READ supposed to know this? How does the data server compute
> > the information? Are all DSs aware of how data is organized on other
> > DSs, or is it the duty of the MDS to own that information? This is
> > where I feel that the existing proposal puts onerous requirements on
> > the pNFS data servers.
> >
> > > With
> > > READZ (is that the current operation name?), this doesn't seem to
> > > present a problem.
> >
> > The draft seems to address the problem by just stating that each DS
> > needs to know other DSs' information.
> 
> This was clarified in subsequent email, as I attempt to summarize
> above.
> 
> >
> >    receives.  In addition, when a data server is returning a
> >    READ4reshole structure, it should still contain the offset and
> >    length of the next allocated block in the file, even if that block
> >    is not located on that particular data server.
> >
> > However, that is not a requirement that pNFS puts on the data servers,
> > is it? Are we wading into a path of new requirements for pNFS data
> > servers here?
> >
> > If my suspicion is true, the spirit of this proposal would discourage
> > pNFS servers from implementing the sparse hints.
> >
> > > It would, I guess, perhaps also not present a problem if clients
> > > could get a hole map from a DS, but I think that's not what the
> > > prior email seemed to be describing?
> >
> > Well, I started from the fact that some vendors could consider the
> > hole map to be metadata, and thus implied that the MDS would be in a
> > better position to serve it than individual data servers (especially
> > if the holes span data servers).
> >
> > To address the pNFS-specific concerns from my original e-mail, we
> > need to answer:
> >
> > 1) Who owns the hole information?
> > 2) Who sends the hole information?
> > 3) How efficiently can they communicate hole information spanning
> >    the pNFS server set?
> >
> --
> 
> Matt Benjamin
> 
> The Linux Box
> 206 South Fifth Ave. Suite 150
> Ann Arbor, MI  48104
> 
> http://linuxbox.com
> 
> tel. 734-761-4689
> fax. 734-769-8938
> cel. 734-216-5309