Re: [nfsv4] New version of sparse draft(draft-hildebrand-nfsv4-read-sparse-01.txt)

"Erasani, Pranoop" <Pranoop.Erasani@netapp.com> Sun, 03 October 2010 07:04 UTC

Date: Sun, 03 Oct 2010 00:05:13 -0700
From: "Erasani, Pranoop" <Pranoop.Erasani@netapp.com>
To: "J. Bruce Fields" <bfields@fieldses.org>
Cc: Benny Halevy <bhalevy@panasas.com>, nfsv4@ietf.org
Subject: Re: [nfsv4] New version of sparse draft(draft-hildebrand-nfsv4-read-sparse-01.txt)

Bruce,

Thanks for getting back.

> Their current proposal is mainly just an optimization to allow
> returning long strings of zeroes to clients in read requests.
> (Servers may be using file allocation information to identify long
> strings of zeroes efficiently, but that's an implementation detail.)
> 
> This isn't about allocation, or metadata--it's just a minor
> incremental improvement to READ.

That's one interpretation. But having to deprecate READ to provide this
feature doesn't seem like a minor feature to me.

I don't read it as just an optimization. The stated goal seems to be
applicable to all kinds of access patterns (i.e., it doesn't mention
that the algorithm is efficient only for sequential access patterns).

The document says:

   This document extends the NFSv4.1 protocol to support efficient
   reading of sparse files.  The number of sparse files is growing in
   the data center, most notably due to the increasing number of virtual
   disk images.  This simple extension provides an easy and efficient
   way for administrators to copy and manage these files without wasting
   disk space or transferring data unnecessarily.

> 
> The proposal decreases the size of read responses, and may decrease
> the number of read requests as well in the case of a sequential
> reader.
> 
> As such, agreed: it doesn't allow a lot of optimizations that an
> allocation map would.

Great. We agree on the first point.
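
To make the agreed point concrete, here is a minimal sketch of how a
read reply might carry long zero runs as (offset, length) holes instead
of literal zero bytes. The names sparse_read, payload_bytes and min_hole
are hypothetical, and the tuples are only an illustration, not the
draft's wire format.

```python
from typing import List, Tuple, Union

# ("hole", offset, length) or ("data", offset, payload); illustrative only.
Segment = Tuple[str, int, Union[int, bytes]]


def sparse_read(content: bytes, offset: int, count: int,
                min_hole: int = 4096) -> List[Segment]:
    """Split content[offset:offset+count] into data and hole segments.

    Zero runs of at least min_hole bytes are reported as holes; shorter
    zero runs stay inside the surrounding data segment.
    """
    end = min(offset + count, len(content))
    segments: List[Segment] = []
    pos = offset
    data_start = offset
    while pos < end:
        if content[pos] == 0:
            # Measure the zero run that starts at pos.
            run_end = pos
            while run_end < end and content[run_end] == 0:
                run_end += 1
            if run_end - pos >= min_hole:
                # Flush any pending data, then emit the hole.
                if pos > data_start:
                    segments.append(("data", data_start, content[data_start:pos]))
                segments.append(("hole", pos, run_end - pos))
                data_start = run_end
            pos = run_end
        else:
            pos += 1
    if end > data_start:
        segments.append(("data", data_start, content[data_start:end]))
    return segments


def payload_bytes(segments: List[Segment]) -> int:
    """Count the literal data bytes a reply built from segments would carry."""
    return sum(len(seg[2]) for seg in segments if seg[0] == "data")
```

In this sketch the reply is never larger than the plain read payload,
which matches the "never performs worse than ordinary read" argument,
but the client still has to issue a read for every range it wants to
learn about.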

> It's also simpler than an allocation-map operation:
> 
> 	- It's just a read, so we already know the semantics.
> 	- It never performs worse than ordinary read.
> 
> You may well be right that it simply isn't worth the trouble, whereas
> a GET_HOLE_MAP operation would be.
>
> But I think your proposal is sufficiently different that it should be
> a separate proposal.  Beats me whether it would be a competitor to
> this one, or complementary to it.

I started with the intent of complementing it, as I didn't see the
current proposal addressing many of the cases that I list below.

Basically, what are the requirements that we are trying to address with
this proposal, and what can we do to extend it so that it benefits a
wider range of clients and servers?

> A few questions about a map:

Thanks for asking the questions.

I haven't thought through the semantics of the hole map myself. So
let's put the questions below in abeyance for a minute and address
my questions about the requirements for Dean's proposal. I promise
to come back with answers once I see some direction on whether the
requirements need to be extended or whether another proposal is
warranted.

> 
> 	- What is its lifetime?  Will it be a recallable object like a
> 	  layout, or does the client invalidate it normally whenever it
> 	  would invalidate its data cache?

For now, I feel that it has to be the latter, similar to the ctime
invalidation that this proposal warrants.
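
As a rough illustration of "the latter", a client could tie a cached
hole map to the change value (ctime or change attribute) it saw when
the map was fetched, and drop the map when that value moves. This is
only a sketch under that assumption; HoleMapCache, CachedHoleMap and
the string file-handle key are hypothetical names, not part of any
proposal.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class CachedHoleMap:
    change: int                    # ctime/change value seen when the map was fetched
    holes: List[Tuple[int, int]]   # (offset, length) pairs


class HoleMapCache:
    """Hypothetical client-side cache keyed by file handle."""

    def __init__(self) -> None:
        self._maps: Dict[str, CachedHoleMap] = {}

    def store(self, fh: str, change: int, holes: List[Tuple[int, int]]) -> None:
        self._maps[fh] = CachedHoleMap(change, list(holes))

    def lookup(self, fh: str, current_change: int) -> Optional[List[Tuple[int, int]]]:
        """Return the cached holes, or None if there is no valid map."""
        entry = self._maps.get(fh)
        if entry is None:
            return None
        if entry.change != current_change:
            # The file changed: invalidate the map along with cached data.
            del self._maps[fh]
            return None
        return entry.holes
```

Nothing is recalled by the server here; the map simply lives and dies
with the client's normal cache revalidation.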

- Pranoop

> 	- Does requesting the block map break write delegations, or (on
> 	  servers that support atime) update the atime?
> 	- How does a request for a map interact with mandatory
> 	  locks?
> 
> --b.