Re: [nfsv4] New version of sparse draft(draft-hildebrand-nfsv4-read-sparse-01.txt)

<Daniel.Muntz@emc.com> Mon, 04 October 2010 17:28 UTC

From: Daniel.Muntz@emc.com
To: bhalevy@panasas.com, thomas@netapp.com
Date: Mon, 04 Oct 2010 13:29:09 -0400
Cc: bfields@fieldses.org, Pranoop.Erasani@netapp.com, nfsv4@ietf.org

 

> -----Original Message-----
> From: nfsv4-bounces@ietf.org [mailto:nfsv4-bounces@ietf.org] 
> On Behalf Of Benny Halevy
> Sent: Monday, October 04, 2010 7:55 AM
> To: Thomas Haynes
> Cc: J. Bruce Fields; Erasani, Pranoop; nfsv4@ietf.org
> Subject: Re: [nfsv4] New version of sparse 
> draft(draft-hildebrand-nfsv4-read-sparse-01.txt)
> 
> On Sun, Oct 3, 2010 at 10:54 PM, Thomas Haynes 
> <thomas@netapp.com> wrote:
> > The harder case is for a non-clustered pNFS community, i.e., one
> > where the MDS and DSes need a control protocol to talk. Let's
> > assume that is the case here.
> >
> > A hole is created when a WRITE happens at a DS that is not
> > contiguous with the previous (if any) WRITE.
> >
> > If a DS is supposed to return the range until the next non-zero
> > data, then it will need to contact every other DS or perhaps
> > the MDS.
> >
> > If the MDS is supposed to honor a GET_SPARSE_BLOCK_MAP
> > type of request from the client, then either each DS would have
> > to send hole entries to the MDS or the MDS would have to
> > probe each one.
> >
> > What if instead of a DS returning information about other DSes,
> > it simply returns information about what it knows? I.e., my next
> > data is at address X.
> 
> I think that this is much cleaner from the protocol perspective.
> The DS is essentially serving NFSv4.1 so its READ result applies
> to the current filehandle which represents the slice stored
> on it, not the logical file striped across multiple DSs.
> 
> Benny

I like this approach as well.  However, it does present some complexity.  The client has to interpret the returned value, using the layout, to determine whether data may exist on another DS in the _possible_ hole.  Dense vs. sparse striping must also be taken into account.  The DS can only return the next byte offset in the stripe file where data exists (it has no knowledge of stripe boundaries or of dense/sparse striping).
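
To make the client-side interpretation concrete, here is a minimal sketch of how a client might map the offsets reported by each DS back into the logical file.  The function names and parameters are hypothetical, and it assumes a simple round-robin layout that starts at DS 0 (no first-stripe-index or pattern offset), so it is an illustration rather than what any client actually does.

```python
def ds_offset_to_logical(ds_index, ds_offset, stripe_unit, num_ds, dense):
    """Map an offset in one DS's stripe file to a logical file offset.

    Assumption: stripe units are assigned round-robin starting at DS 0.
    """
    unit, within = divmod(ds_offset, stripe_unit)
    if dense:
        # Dense: the DS file holds only this DS's stripe units, packed together.
        return (unit * num_ds + ds_index) * stripe_unit + within
    # Sparse: the DS file offset is the logical offset, but only every
    # num_ds-th stripe unit actually belongs to this DS.
    if unit % num_ds != ds_index:
        raise ValueError("offset does not belong to this DS under sparse striping")
    return ds_offset


def earliest_possible_data(next_data_by_ds, stripe_unit, num_ds, dense):
    """Given each DS's reported 'next data' offset, return the lowest logical
    offset at which any DS has data.  Everything below it is a real hole."""
    return min(
        ds_offset_to_logical(i, off, stripe_unit, num_ds, dense)
        for i, off in next_data_by_ds.items()
    )
```

With a 32k stripe unit and 4 DSes, if DS 0 reports its next data at 1M (sparse) while the other three report data in their first stripe unit, earliest_possible_data() returns 32768: the hole that DS 0 sees up to 1M is only a 32k hole in the logical file.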

  -Dan

> 
> >
> > Consider a file with a width of 32k. The first write it gets is at 1M
> > and then it gets 5 other contiguous writes. The next data is at
> > 2M and it gets 4 other contiguous writes. The DS can't assume
> > that any other DS has zeros in the first 1M. Worst case, all other
> > N-1 DSes do have data there.
> >
> > In that scenario, N-1 DSes will return 32k on the first read and
> > 1 DS will return a hole for it only until 1M. The client will then
> > only send N-1 READs until it gets to the 1M mark.
> >
> >
> > _______________________________________________
> > nfsv4 mailing list
> > nfsv4@ietf.org
> > https://www.ietf.org/mailman/listinfo/nfsv4
> >