Re: [nfsv4] 4.1 client - LAYOUTCOMMIT & close

"Sandeep Joshi" <sjoshi@bluearc.com> Fri, 09 July 2010 00:04 UTC

Date: Thu, 08 Jul 2010 17:03:58 -0700
Message-ID: <A062FCC8662DA848949F7C3046B9BEAE02A24823@us-email.terastack.bluearc.com>
In-Reply-To: <1278623771.13551.54.camel@heimdal.trondhjem.org>
From: Sandeep Joshi <sjoshi@bluearc.com>
To: Trond Myklebust <trond.myklebust@fys.uio.no>, david.black@emc.com
Cc: linux-nfs@vger.kernel.org, garth@panasas.com, welch@panasas.com, nfsv4@ietf.org, andros@netapp.com, bhalevy@panasas.com
Subject: Re: [nfsv4] 4.1 client - LAYOUTCOMMIT & close

It seems we agree that LAYOUTCOMMIT will be sent for the file layout
type, correct?
If so, should I file a defect on this?

For reference, my original email is below.

// START
In certain cases, I don't see a LAYOUTCOMMIT on a file at all, even
after doing many writes.



Client side operations:

open
write(s)
close


On server side (observed operations):

open
layoutget(s)
close


But I do not see a LAYOUTCOMMIT at all. The client wrote about 4-5MB
of data.

When does the client issue LAYOUTCOMMIT?
 
// END
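
To make the expected trace concrete, here is a minimal sketch of the
client behaviour this thread argues for: LAYOUTCOMMIT goes to the
metadata server before CLOSE whenever data was written through a layout.
It is a toy model, not the Linux client; the names MdsTrace and PnfsFile
and the string form of each operation are assumptions made up for
illustration.

```python
from dataclasses import dataclass, field


@dataclass
class MdsTrace:
    """Records the operations a metadata server would observe (hypothetical)."""
    ops: list = field(default_factory=list)

    def send(self, op: str) -> None:
        self.ops.append(op)


class PnfsFile:
    """Toy model of one open file on a pNFS client; all names are illustrative."""

    def __init__(self, mds: MdsTrace):
        self.mds = mds
        self.has_layout = False
        # Highest byte offset written via the layout; None means nothing written.
        self.last_write_offset = None

    def open(self) -> None:
        self.mds.send("OPEN")

    def write(self, offset: int, data: bytes) -> None:
        if not data:
            return
        if not self.has_layout:
            self.mds.send("LAYOUTGET")
            self.has_layout = True
        # The data itself goes to a data server, so the MDS sees no WRITE.
        end = offset + len(data) - 1
        if self.last_write_offset is None or end > self.last_write_offset:
            self.last_write_offset = end

    def close(self) -> None:
        # Assumption taken from the thread: a correct client always sends
        # LAYOUTCOMMIT for data it wrote through a layout.
        if self.last_write_offset is not None:
            self.mds.send(f"LAYOUTCOMMIT last_write_offset={self.last_write_offset}")
            self.last_write_offset = None
        self.mds.send("CLOSE")


if __name__ == "__main__":
    mds = MdsTrace()
    f = PnfsFile(mds)
    f.open()
    f.write(0, b"x" * 4096)
    f.write(4096, b"y" * 4096)
    f.close()
    print(mds.ops)
```

With this model the server-side trace for open/write(s)/close contains a
LAYOUTCOMMIT between the LAYOUTGET and the CLOSE, which is the operation
missing from the trace above.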


Regards,

Sandeep

-----Original Message-----
From: nfsv4-bounces@ietf.org [mailto:nfsv4-bounces@ietf.org] On Behalf
Of Trond Myklebust
Sent: Thursday, July 08, 2010 2:16 PM
To: david.black@emc.com
Cc: linux-nfs@vger.kernel.org; garth@panasas.com; welch@panasas.com;
nfsv4@ietf.org; andros@netapp.com; bhalevy@panasas.com
Subject: Re: [nfsv4] 4.1 client - LAYOUTCOMMIT & close

On Thu, 2010-07-08 at 16:30 -0400, david.black@emc.com wrote:
> > Note that a LAYOUTRETURN can arrive without LAYOUTCOMMIT if the 
> > client hasn't written to the file.  I'm not sure what about the 
> > blocks case though, do you implicitly free up any provisionally 
> > allocated blocks that the client had not explicitly committed using
> > LAYOUTCOMMIT?
> 
> In principle, yes as the blocks are no longer promised to the client, 
> although lazy evaluation of this is an obvious optimization.
> 
> > >> "Upon receiving an OPEN, LOCK or a WANT_DELEGATION, the server 
> > >> must check that it has received LAYOUTCOMMITs from any other 
> > >> clients that may have the file open for writing. If it hasn't, 
> > >> then it MUST take some action to ensure that any file data 
> > >> changes are accompanied by a change
> > >                            ^ potentially visible
> > >> attribute update."
> > 
> > That should be OK as long as it's not for every GETATTR for the 
> > change, mtime, or size attributes.
> > 
> > >>
> > >> Then you can add the above suggestion without the offending 
> > >> caveat. Note however that it does break the "SHOULD NOT" 
> > >> admonition in section 18.32.4.
> > 
> > Better be safe than sorry in this rare error case.
> 
> I concur with Benny on both of the above - in essence, the unrecovered
> client failure is a reason to potentially ignore the "SHOULD" (server
> can't know whether it actually ignored the "SHOULD", hence better safe
> than sorry).  We probably ought to find someplace appropriate to add a
> paragraph or two explaining this in one of the 4.2 documents.

Right. I'm only interested in fixing the close-to-open case. The case of
general GETATTR calls might be nice to fix too, but it should not be
essential in order to ensure that well-behaved applications continue to
work as expected.

Note, however, that legacy support for stateless protocols like NFSv2
and NFSv3 may be problematic: there is no equivalent of OPEN, and so the
server may have to do the above check on all NFSPROC2_GETATTR,
NFSPROC3_GETATTR, NFSPROC2_LOOKUP and NFSPROC3_LOOKUP requests.

   Trond
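
A minimal sketch of the close-to-open check discussed above, for a
single file. It is not taken from any server implementation: the class
MetadataServer, its method names, and the integer change attribute are
all assumptions for illustration. On OPEN the toy server looks for other
clients that hold a write layout but have not sent LAYOUTCOMMIT, and if
it finds any it bumps the change attribute, accepting a false positive
rather than risking a false negative.

```python
class MetadataServer:
    """Toy metadata server for one file; every name here is hypothetical."""

    def __init__(self):
        self.change_attr = 0
        # Clients holding a write layout that have not sent LAYOUTCOMMIT.
        self.uncommitted_writers = set()

    def layoutget_rw(self, client: str) -> None:
        self.uncommitted_writers.add(client)

    def layoutcommit(self, client: str) -> None:
        # Simplification: a real server would have to re-mark the client
        # if it keeps writing through the same layout after committing.
        if client in self.uncommitted_writers:
            self.uncommitted_writers.discard(client)
            self.change_attr += 1

    def open(self, client: str) -> int:
        """Return the change attribute the opening client will observe."""
        others = self.uncommitted_writers - {client}
        if others:
            # The server cannot tell whether those clients wrote via the
            # data servers, so it assumes they did. The writers stay
            # marked, so later OPENs bump the attribute again.
            self.change_attr += 1
        return self.change_attr


if __name__ == "__main__":
    mds = MetadataServer()
    before = mds.open("client1")   # client 1: OPEN, READ, CLOSE
    mds.open("client2")            # client 2: OPEN
    mds.layoutget_rw("client2")    # client 2: LAYOUTGET, WRITE via DS, dies
    after = mds.open("client1")    # client 1: OPEN, verify change attribute
    print(before, after)
```

In this run client 1 sees a different change attribute on its second
OPEN and so knows to discard its cached data, even though client 2 never
sent LAYOUTCOMMIT. The cost is that cached data is invalidated on every
OPEN while an uncommitted writer exists, whether or not it wrote.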

> Thanks,
> --David
> 
> 
> > -----Original Message-----
> > From: Benny Halevy [mailto:bhalevy.lists@gmail.com] On Behalf Of 
> > Benny Halevy
> > Sent: Thursday, July 08, 2010 12:00 PM
> > To: Trond Myklebust
> > Cc: Black, David; Noveck, David; Muntz, Daniel; 
> > linux-nfs@vger.kernel.org; garth@panasas.com; welch@panasas.com; 
> > nfsv4@ietf.org; andros@netapp.com
> > Subject: Re: [nfsv4] 4.1 client - LAYOUTCOMMIT & close
> > 
> > On Jul. 08, 2010, 2:14 +0300, Trond Myklebust
> > <trond.myklebust@fys.uio.no> wrote:
> > > On Wed, 2010-07-07 at 19:09 -0400, Trond Myklebust wrote:
> > >> On Wed, 2010-07-07 at 18:52 -0400, Trond Myklebust wrote:
> > >>> On Wed, 2010-07-07 at 18:44 -0400, david.black@emc.com wrote:
> > >>>> Let me try this ...
> > >>>>
> > >>>> A correct client will always send LAYOUTCOMMIT.
> > >>>> Assume that the client is correct.
> > >>>> Hence if the LAYOUTCOMMIT doesn't arrive, something's failed.
> > >>>>
> > >>>> Important implication: No LAYOUTCOMMIT is an error/failure 
> > >>>> case.  It just has to work; it doesn't have to be fast.
> > >>>>
> > 
> > Note that a LAYOUTRETURN can arrive without LAYOUTCOMMIT if the 
> > client hasn't written to the file.  I'm not sure what about the 
> > blocks case though, do you implicitly free up any provisionally 
> > allocated blocks that the client had not explicitly committed using
> > LAYOUTCOMMIT?
> > 
> > >>>> Suggestion: If a client dies while holding writeable layouts 
> > >>>> that permit write-in-place, and the client doesn't reappear or 
> > >>>> doesn't reclaim those layouts, then the server should assume 
> > >>>> that the files involved were written before the client died, 
> > >>>> and set the file attributes accordingly as part of internally 
> > >>>> reclaiming the layout that the client has abandoned.
> > 
> > Of course. That's part of the server recovery.
> > 
> > >>>>
> > >>>> Caveat: It may take a while for the server to determine that 
> > >>>> the client has abandoned a layout.
> > 
> > That's two lease times after a respective CB_LAYOUTRECALL.
> > 
> > >>>>
> > >>>> This can result in false positives (file appears to be modified
> > >>>> when it wasn't) but won't yield false negatives (file does not
> > >>>> appear to be modified even though it was modified).
> > >>>
> > >>> OK... So we're going to have to turn off client side file 
> > >>> caching entirely for pNFS? I can do that...
> > >>>
> > >>> The above won't work. Think readahead...
> > >>
> > >> So... What can work, is if you modify it to work explicitly for 
> > >> close-to-open
> > >>
> > >> "Upon receiving an OPEN, LOCK or a WANT_DELEGATION, the server 
> > >> must check that it has received LAYOUTCOMMITs from any other 
> > >> clients that may have the file open for writing. If it hasn't, 
> > >> then it MUST take some action to ensure that any file data 
> > >> changes are accompanied by a change
> > >                            ^ potentially visible
> > >> attribute update."
> > 
> > That should be OK as long as it's not for every GETATTR for the 
> > change, mtime, or size attributes.
> > 
> > >>
> > >> Then you can add the above suggestion without the offending 
> > >> caveat. Note however that it does break the "SHOULD NOT" 
> > >> admonition in section 18.32.4.
> > 
> > Better be safe than sorry in this rare error case.
> > 
> > Benny
> > 
> > >>
> > >> Trond
> > >>
> > >>
> > >>> Trond
> > >>>
> > >>>> Thanks,
> > >>>> --David
> > >>>>
> > >>>>> -----Original Message-----
> > >>>>> From: nfsv4-bounces@ietf.org [mailto:nfsv4-bounces@ietf.org]
> > >>>>> On Behalf Of Noveck_David@emc.com
> > >>>>> Sent: Wednesday, July 07, 2010 6:04 PM
> > >>>>> To: Trond.Myklebust@netapp.com; Muntz, Daniel
> > >>>>> Cc: linux-nfs@vger.kernel.org; garth@panasas.com;
> > >>>>> welch@panasas.com; nfsv4@ietf.org; andros@netapp.com;
> > >>>>> bhalevy@panasas.com
> > >>>>> Subject: Re: [nfsv4] 4.1 client - LAYOUTCOMMIT & close
> > >>>>>
> > >>>>>> Yes. I would agree that the client cannot rely on the updates
> > >>>>>> being made visible if it fails to send the LAYOUTCOMMIT. My
> > >>>>>> point was simply that a compliant server MUST also have a valid
> > >>>>>> strategy for dealing with the case where the client doesn't
> > >>>>>> send it.
> > >>>>>
> > >>>>> So you are saying the updates "MUST be made visible" through 
> > >>>>> the server's valid strategy.  Is that right.
> > >>>>>
> > >>>>> And that the client cannot rely on that.  Why not, if the 
> > >>>>> server must have a valid strategy.
> > >>>>>
> > >>>>> Is this just prudent "belt and suspenders" design or what?
> > >>>>>
> > >>>>> It seems to me that if one side here is MUST (and the spec
> > >>>>> needs to be clearer about what might or might not constitute a
> > >>>>> valid strategy), then the other side should be SHOULD.
> > >>>>>
> > >>>>> If both sides are "MUST", then if things don't work out then
> > >>>>> the client and server can equally point to one another and say
> > >>>>> "It's his fault".
> > >>>>>
> > >>>>> Am I missing something here?
> > >>>>>
> > >>>>>
> > >>>>>
> > >>>>> -----Original Message-----
> > >>>>> From: nfsv4-bounces@ietf.org [mailto:nfsv4-bounces@ietf.org] 
> > >>>>> On Behalf Of Trond Myklebust
> > >>>>> Sent: Wednesday, July 07, 2010 5:01 PM
> > >>>>> To: Muntz, Daniel
> > >>>>> Cc: linux-nfs@vger.kernel.org; garth@panasas.com; 
> > >>>>> welch@panasas.com; nfsv4@ietf.org; andros@netapp.com; 
> > >>>>> bhalevy@panasas.com
> > >>>>> Subject: Re: [nfsv4] 4.1 client - LAYOUTCOMMIT & close
> > >>>>>
> > >>>>> On Wed, 2010-07-07 at 16:39 -0400, Daniel.Muntz@emc.com wrote:
> > >>>>>> To bring this discussion full circle, since we agree that a
> > >>>>>> compliant server can implement a scheme where written data does
> > >>>>>> not become visible until after a LAYOUTCOMMIT, do we also agree
> > >>>>>> that LAYOUTCOMMIT is a "MUST" from a compliant client
> > >>>>>> (independent of layout type)?
> > >>>>>
> > >>>>> Yes. I would agree that the client cannot rely on the updates
> > >>>>> being made visible if it fails to send the LAYOUTCOMMIT. My
> > >>>>> point was simply that a compliant server MUST also have a valid
> > >>>>> strategy for dealing with the case where the client doesn't
> > >>>>> send it.
> > >>>>>
> > >>>>> Cheers
> > >>>>>   Trond
> > >>>>>
> > >>>>>>   -Dan
> > >>>>>>
> > >>>>>>> -----Original Message-----
> > >>>>>>> From: nfsv4-bounces@ietf.org [mailto:nfsv4-bounces@ietf.org]
> > >>>>>>> On Behalf Of Trond Myklebust
> > >>>>>>> Sent: Wednesday, July 07, 2010 7:04 AM
> > >>>>>>> To: Benny Halevy
> > >>>>>>> Cc: andros@netapp.com; linux-nfs@vger.kernel.org; Garth 
> > >>>>>>> Gibson; Brent Welch; NFSv4
> > >>>>>>> Subject: Re: [nfsv4] 4.1 client - LAYOUTCOMMIT & close
> > >>>>>>>
> > >>>>>>> On Wed, 2010-07-07 at 16:51 +0300, Benny Halevy wrote:
> > >>>>>>>> On Jul. 07, 2010, 16:18 +0300, Trond Myklebust
> > >>>>>>>> <Trond.Myklebust@netapp.com> wrote:
> > >>>>>>>>> On Wed, 2010-07-07 at 09:06 -0400, Trond Myklebust wrote:
> > >>>>>>>>>> On Wed, 2010-07-07 at 15:05 +0300, Benny Halevy wrote:
> > >>>>>>>>>>> On Jul. 06, 2010, 23:40 +0300, Trond Myklebust
> > >>>>>>>>>>> <trond.myklebust@fys.uio.no> wrote:
> > >>>>>>>>>>>> On Tue, 2010-07-06 at 15:20 -0400, Daniel.Muntz@emc.com
> > >>>>>>>>>>>> wrote:
> > >>>>>>>>>>>>> The COMMIT to the DS, ttbomk, commits data on the DS.
> > >>>>>>>>>>>>> I see it as orthogonal to updating the metadata on the
> > >>>>>>>>>>>>> MDS (but perhaps I'm wrong). As sjoshi@bluearc
> > >>>>>>>>>>>>> mentioned, the LAYOUTCOMMIT provides a synchronization
> > >>>>>>>>>>>>> point, so even if the non-clustered server does not
> > >>>>>>>>>>>>> want to update metadata on every DS I/O, the
> > >>>>>>>>>>>>> LAYOUTCOMMIT could also be a trigger to execute
> > >>>>>>>>>>>>> whatever synchronization mechanism the implementer
> > >>>>>>>>>>>>> wishes to put in the control protocol.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>> As far as I'm aware, there are no exceptions in RFC5661
> > >>>>>>>>>>>> that would allow pNFS servers to break the rule that any
> > >>>>>>>>>>>> visible change to the data must be atomically
> > >>>>>>>>>>>> accompanied with a change attribute update.
> > >>>>>>>>>>>>
> > >>>>>>>>>>>
> > >>>>>>>>>>> Trond, I'm not sure how this rule you mentioned is
> > >>>>>>>>>>> specified.
> > >>>>>>>>>>>
> > >>>>>>>>>>> See more in section 12.5.4 and 12.5.4.1. LAYOUTCOMMIT
> > >>>>>>>>>>> and change/time_modify in particular:
> > >>>>>>>>>>>
> > >>>>>>>>>>>    For some layout protocols, the storage device is able
> > >>>>>>>>>>>    to notify the metadata server of the occurrence of an
> > >>>>>>>>>>>    I/O; as a result, the change and time_modify
> > >>>>>>>>>>>    attributes may be updated at the metadata server.  For
> > >>>>>>>>>>>    a metadata server that is capable of monitoring
> > >>>>>>>>>>>    updates to the change and time_modify attributes,
> > >>>>>>>>>>>    LAYOUTCOMMIT processing is not required to update the
> > >>>>>>>>>>>    change attribute.  In this case, the metadata server
> > >>>>>>>>>>>    must ensure that no further update to the data has
> > >>>>>>>>>>>    occurred since the last update of the attributes;
> > >>>>>>>>>>>    file-based protocols may have enough information to
> > >>>>>>>>>>>    make this determination or may update the change
> > >>>>>>>>>>>    attribute upon each file modification.  This also
> > >>>>>>>>>>>    applies for the time_modify attribute.  If the server
> > >>>>>>>>>>>    implementation is able to determine that the file has
> > >>>>>>>>>>>    not been modified since the last time_modify update,
> > >>>>>>>>>>>    the server need not update time_modify at
> > >>>>>>>>>>>    LAYOUTCOMMIT.  At LAYOUTCOMMIT completion, the updated
> > >>>>>>>>>>>    attributes should be visible if that file was modified
> > >>>>>>>>>>>    since the latest previous LAYOUTCOMMIT or LAYOUTGET
> > >>>>>>>>>>
> > >>>>>>>>>> I know. However the above paragraph does not state that
> > >>>>>>>>>> the server should make those changes visible to clients
> > >>>>>>>>>> other than the one that is writing.
> > >>>>>>>>>>
> > >>>>>>>>>> Section 18.32.4 states that writes will cause the
> > >>>>>>>>>> time_modified and change attributes to be updated (if and
> > >>>>>>>>>> only if the file data is modified). Several other sections
> > >>>>>>>>>> rely on this behaviour, including section 10.3.1, section
> > >>>>>>>>>> 11.7.2.2, and section 11.7.7.
> > >>>>>>>>>>
> > >>>>>>>>>> The only 'special behaviour' that I see allowed for pNFS
> > >>>>>>>>>> is in section 13.10, which states that clients can't
> > >>>>>>>>>> expect to see changes immediately, but that they must be
> > >>>>>>>>>> able to expect close-to-open semantics to work. Again, if
> > >>>>>>>>>> this is to be the case, then the server _must_ be able to
> > >>>>>>>>>> deal with the case where client 1 dies before it can issue
> > >>>>>>>>>> the LAYOUTCOMMIT.
> > >>>>>>>>
> > >>>>>>>> Agreed.
> > >>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>
> > >>>>>>>>>>>> As I see it, if your server allows one client to read
> > >>>>>>>>>>>> data that may have been modified by another client that
> > >>>>>>>>>>>> holds a WRITE layout for that range then (since that is
> > >>>>>>>>>>>> a visible data change) it should provide a change
> > >>>>>>>>>>>> attribute update irrespective of whether or not a
> > >>>>>>>>>>>> LAYOUTCOMMIT has been sent.
> > >>>>>>>>>>>
> > >>>>>>>>>>> the requirement for the server in WRITE's implementation
> > >>>>>>>>>>> section is quite weak: "It is assumed that the act of
> > >>>>>>>>>>> writing data to a file will cause the time_modified and
> > >>>>>>>>>>> change attributes of the file to be updated."
> > >>>>>>>>>>>
> > >>>>>>>>>>> The difference here is that for pNFS the written data is
> > >>>>>>>>>>> not guaranteed to be visible until LAYOUTCOMMIT.  In a
> > >>>>>>>>>>> broader sense, assuming the clients are caching dirty
> > >>>>>>>>>>> data and use a write-behind cache, application-written
> > >>>>>>>>>>> data may be visible to other processes on the same host
> > >>>>>>>>>>> but not to others until fsync() or close() -
> > >>>>>>>>>>> open-to-close semantics are the only thing the client
> > >>>>>>>>>>> guarantees, right?  Issuing LAYOUTCOMMIT on fsync() and
> > >>>>>>>>>>> close() ensure the data is committed to stable storage
> > >>>>>>>>>>> and is visible to all other clients in the cluster.
> > >>>>>>>>>>
> > >>>>>>>>>> See above. I'm not disputing your statement that 'the
> > >>>>>>>>>> written data is not guaranteed to be visible until
> > >>>>>>>>>> LAYOUTCOMMIT'. I am disputing an assumption that 'the
> > >>>>>>>>>> written data may be visible without an accompanying change
> > >>>>>>>>>> attribute update'.
> > >>>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> In other words, I'd expect the following scenario to give
> > >>>>>>>>> the same results in NFSv4.1 w/pNFS as it does in NFSv4:
> > >>>>>>>>
> > >>>>>>>> That's a strong requirement that may limit the scalability
> > >>>>>>>> of the server.
> > >>>>>>>>
> > >>>>>>>> The spirit of the pNFS operations, at least from Panasas
> > >>>>>>>> perspective was that the data is transient until
> > >>>>>>>> LAYOUTCOMMIT, meaning it may or may not be visible to
> > >>>>>>>> clients other than the one who wrote it, and its associated
> > >>>>>>>> metadata MUST be updated and describe the new data only on
> > >>>>>>>> LAYOUTCOMMIT and until then it's undefined, i.e. it's up to
> > >>>>>>>> the server implementation whether to update it or not.
> > >>>>>>>>
> > >>>>>>>> Without locking, what do the stronger semantics buy you?
> > >>>>>>>> Even if a client verified the change_attribute new data may
> > >>>>>>>> become visible at any time after the GETATTR if the
> > >>>>>>>> file/byte range aren't locked.
> > >>>>>>>
> > >>>>>>> There is no locking needed in the scenario below: it is
> > >>>>>>> ordinary close-to-open semantics.
> > >>>>>>>
> > >>>>>>> The point is that if you remove the one and only way that
> > >>>>>>> clients have to determine whether or not their data caches
> > >>>>>>> are valid, then they can no longer cache data at all, and
> > >>>>>>> server scalability will be shot to smithereens anyway.
> > >>>>>>>
> > >>>>>>> Trond
> > >>>>>>>
> > >>>>>>>> Benny
> > >>>>>>>>
> > >>>>>>>>>
> > >>>>>>>>> Client 1			Client 2
> > >>>>>>>>> ========			========
> > >>>>>>>>>
> > >>>>>>>>> OPEN foo
> > >>>>>>>>> READ
> > >>>>>>>>> CLOSE
> > >>>>>>>>> 				OPEN
> > >>>>>>>>> 				LAYOUTGET ...
> > >>>>>>>>> 				WRITE via DS
> > >>>>>>>>> 				<dies>...
> > >>>>>>>>> OPEN foo
> > >>>>>>>>> verify change_attr
> > >>>>>>>>> READ if above WRITE is visible
> > >>>>>>>>> CLOSE
> > >>>>>>>>>
> > >>>>>>>>> Trond
> > >>>>>>>>> _______________________________________________
> > >>>>>>>>> nfsv4 mailing list
> > >>>>>>>>> nfsv4@ietf.org
> > >>>>>>>>> https://www.ietf.org/mailman/listinfo/nfsv4