Re: [nfsv4] New version of sparse draft(draft-hildebrand-nfsv4-read-sparse-01.txt)

Benny Halevy <> Mon, 04 October 2010 14:54 UTC

From: Benny Halevy <>
To: Thomas Haynes <>
Cc: "J. Bruce Fields" <>, "Erasani, Pranoop" <>,

On Sun, Oct 3, 2010 at 10:54 PM, Thomas Haynes <> wrote:
> The harder case is for a non-clustered pNFS community, i.e., one
> where the MDS and DSes need a control protocol to talk. Let's
> assume that is the case here.
> A hole is created when a WRITE happens at a DS that is not
> contiguous with the previous (if any) WRITE.
> If a DS is supposed to return the range until the next non-zero
> data, then it will need to contact every other DS or perhaps
> the MDS.
> If the MDS is supposed to honor a GET_SPARSE_BLOCK_MAP
> type of request from the client, then either each DS would have
> to send hole entries to the MDS or the MDS would have to
> probe each one.
> What if instead of a DS returning information about other DSes,
> it simply returns information about what it knows? I.e., my next
> data is at address X.

I think that this is much cleaner from the protocol perspective.
The DS is essentially serving NFSv4.1, so its READ result applies
to the current filehandle, which represents the slice stored
on that DS, not the logical file striped across multiple DSes.
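
To make the "DS answers only for what it knows" idea concrete, here is a
minimal sketch. It is not taken from the draft: the function name
local_read, the extent list, and the reply tuples are all hypothetical, and
offsets are assumed to be in whatever address space the DS filehandle uses.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple

# Hypothetical model: the DS tracks the written ranges of its own slice
# as sorted, non-overlapping (start, length) extents.
Extent = Tuple[int, int]


def local_read(extents: List[Extent], offset: int, count: int
               ) -> Tuple[str, int, Optional[int]]:
    """Answer a READ using only this DS's own extents.

    Returns ("data", offset, length) when offset falls inside written data,
    or ("hole", offset, next_data) where next_data is the start of this
    DS's next extent, or None if it has no data past offset.
    """
    starts = [start for start, _ in extents]
    i = bisect_right(starts, offset) - 1  # last extent starting at or before offset
    if i >= 0:
        start, length = extents[i]
        if offset < start + length:
            # Inside written data: return at most up to the end of the extent.
            return ("data", offset, min(count, start + length - offset))
    # In a hole: report only where *this* DS has data next.
    next_data = extents[i + 1][0] if i + 1 < len(extents) else None
    return ("hole", offset, next_data)
```

The point of the sketch is that the reply never needs state from another
DS or from the MDS, so no control protocol traffic is involved.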


> Consider a file with a width of 32k. The first write it gets is at 1M
> and then it gets 5 other contiguous writes. The next data is at
> 2M and it gets 4 other contiguous writes. The DS can't assume
> that any other DS has zeros in the first 1M. Worst case, all other
> N-1 DSes do have data there.
> In that scenario, N-1 DSes will return 32k on the first read and
> 1 DS will return a hole for it only until 1M. The client will then
> only send N-1 READs until it gets to the 1M mark.
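
The scenario above can be turned into a rough count of the READs a client
sends. This is only a sketch under assumptions the thread does not settle:
dense round-robin striping, hole replies that carry a logical file offset,
and a DS that has data everywhere past its first write. The names
count_reads, first_data, and use_hints are hypothetical.

```python
STRIPE_UNIT = 32 * 1024
MIB = 1024 * 1024


def count_reads(num_ds: int, first_data: list, end: int,
                use_hints: bool = True) -> int:
    """Count READs needed to scan [0, end) one stripe unit at a time.

    first_data[k] is the (assumed logical) offset of DS k's first data;
    0 means the DS has data from the start of the file.
    """
    skip_until = [0] * num_ds  # per-DS "next data at X" learned from hole replies
    reads = 0
    for offset in range(0, end, STRIPE_UNIT):
        ds = (offset // STRIPE_UNIT) % num_ds  # assumed round-robin layout
        if use_hints and offset < skip_until[ds]:
            continue  # known hole on this DS; client supplies zeros itself
        reads += 1
        if offset < first_data[ds]:
            # Reply was a hole; remember where this DS said its data starts.
            skip_until[ds] = first_data[ds]
    return reads
```

With 4 DSes, one of which has its first data at 1M, scanning the first 1M
takes 25 READs with hints (one to the sparse DS, then only the other three)
versus 32 without: count_reads(4, [MIB, 0, 0, 0], MIB) compared with the
same call using use_hints=False.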