Re: [nfsv4] New version of sparse draft(draft-hildebrand-nfsv4-read-sparse-01.txt)

"Erasani, Pranoop" <> Sun, 03 October 2010 07:13 UTC

From: "Erasani, Pranoop" <>
To: "Matt W. Benjamin" <>, "J. Bruce Fields" <>
Cc: Benny Halevy <>,

> Hi,
> Just relative to pNFS, my immediate reaction was that a DS might have
> the relevant file allocation information, and the MDS might not. 

A DS might have the relevant file allocation information for the data it holds itself,
but not for data on other DSs. AFAIK, pNFS does not mandate that each DS know
about other DSs' information. It's the MDS that has access to DS

Given the current proposal, as part of the READ response, the DS is supposed to
send the offset where the hole ends and the actual data begins. If the hole ends
on a different DS, how is the DS responding to the READ supposed to know this?
How does the data server compute that information? Are all DSs aware
of how data is organized on other DSs, or is it the duty of the MDS to own
that information? This is where I feel the existing proposal puts onerous
requirements on the pNFS data servers.
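
To make the concern concrete, here is a minimal sketch. It assumes simple
round-robin striping with a fixed stripe unit, and extents that are
stripe-aligned and no larger than one stripe unit. The names STRIPE_UNIT,
NUM_DS, ds_for_offset, local_view and next_data_offset are hypothetical and
are not taken from the draft or from any pNFS implementation.

```python
STRIPE_UNIT = 4096  # assumed stripe unit in bytes (hypothetical)
NUM_DS = 3          # assumed number of data servers (hypothetical)


def ds_for_offset(offset):
    """Index of the DS holding the byte at `offset` under round-robin striping."""
    return (offset // STRIPE_UNIT) % NUM_DS


def local_view(allocated, ds):
    """Extents a single DS can see: only those striped onto that DS."""
    return [(start, length) for start, length in allocated
            if ds_for_offset(start) == ds]


def next_data_offset(allocated, offset):
    """First offset >= `offset` that holds data, or None if there is none.

    `allocated` is a sorted list of non-overlapping (start, length) extents.
    """
    for start, length in allocated:
        if start <= offset < start + length:
            return offset  # already inside an allocated extent
        if start > offset:
            return start   # the hole ends where this extent begins
    return None


# File-wide allocation: stripe 0 and stripe 5 hold data, stripes 1-4 are a hole.
allocated = [(0, 4096), (5 * 4096, 4096)]

read_offset = 4096                                    # READ lands in the hole
serving_ds = ds_for_offset(read_offset)               # DS 1 serves this READ
global_answer = next_data_offset(allocated, read_offset)            # 20480
owning_ds = ds_for_offset(global_answer)              # DS 2 holds that data
local_answer = next_data_offset(local_view(allocated, serving_ds),
                                read_offset)          # None: DS 1 cannot tell
```

In this layout the READ is served by DS 1, but the hole ends on DS 2, so DS 1
cannot produce the correct offset from its own allocation information alone.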

> With
> READZ (is that the current operation name?), this doesn't seem to
> present a problem. 

The draft seems to address the problem by just stating that each DS needs
to know the other DSs' information:

   receives.  In addition, when a data server is returning a
   READ4reshole structure, it should still contain the offset and length
   of the next allocated block in the file, even if that block is not
   located on that particular data server.

However, that is not a requirement that pNFS puts on the data servers, is it? Are
we heading down a path of new requirements for pNFS data servers here?

If my suspicion is true, the spirit of this proposal would discourage pNFS servers
from implementing the sparse hints.

> It would, I guess, perhaps also not present a
> problem if clients could get a hole map from a DS, but I think that's
> not what the prior email seemed to be describing?

Well, I started from the point that some vendors could consider the hole map
to be metadata, which implies that the MDS would be in a better position to
serve it than the individual data servers (especially if the holes
span data servers).

To address the pNFS-specific concerns from my original e-mail, we need to answer:

1). Who owns the hole information?
2). Who sends the hole information?
3). How efficiently can hole information that spans the pNFS server set be communicated?
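
As one possible illustration of questions 1) and 2), and not something the
draft specifies, the sketch below assumes that each DS reports its allocated
extents (in file offsets) to the MDS, which merges them into a file-wide hole
map. The function names and the shape of the per-DS report are hypothetical.

```python
def merge_extents(per_ds_extents):
    """Merge per-DS (start, length) lists into sorted, coalesced file-wide extents."""
    flat = sorted(e for extents in per_ds_extents.values() for e in extents)
    merged = []
    for start, length in flat:
        if merged and start <= merged[-1][0] + merged[-1][1]:
            # Overlapping or adjacent: extend the previous extent.
            prev_start, prev_len = merged[-1]
            end = max(prev_start + prev_len, start + length)
            merged[-1] = (prev_start, end - prev_start)
        else:
            merged.append((start, length))
    return merged


def hole_map(merged, file_size):
    """Return the holes as (start, length) pairs, given coalesced extents."""
    holes = []
    cursor = 0
    for start, length in merged:
        if start > cursor:
            holes.append((cursor, start - cursor))
        cursor = max(cursor, start + length)
    if cursor < file_size:
        holes.append((cursor, file_size - cursor))
    return holes


# Hypothetical reports from three data servers for a 32 KiB file.
reports = {0: [(0, 4096)], 1: [], 2: [(20480, 4096)]}
merged = merge_extents(reports)     # [(0, 4096), (20480, 4096)]
holes = hole_map(merged, 32768)     # [(4096, 16384), (24576, 8192)]
```

With this arrangement only the MDS needs the cross-DS view. How the reports
reach the MDS, and how efficiently, is exactly question 3) and is left open
here.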

- Pranoop

> Thanks,
> Matt
> ----- "J. Bruce Fields" <> wrote:
> >
> > A few questions about a map:
> >
> > 	- What is its lifetime?  Will it be a recallable object like a
> > 	  layout, or does the client invalidate it normally whenever it
> > 	  would invalidate its data cache?
> > 	- Does requesting the block map break write delegations, or (on
> > 	  servers that support atime) update the atime?
> > 	- How does a request for a map interact with mandatory
> > 	  locks?
> >
> > --b.
> --
> Matt Benjamin
> The Linux Box
> 206 South Fifth Ave. Suite 150
> Ann Arbor, MI  48104
> tel. 734-761-4689
> fax. 734-769-8938
> cel. 734-216-5309