Re: [nfsv4] LAYOUTCOMMIT discussion

<Noveck_David@emc.com> Wed, 28 July 2010 06:30 UTC

Date: Wed, 28 Jul 2010 02:31:03 -0400
From: Noveck_David@emc.com
To: Craig.Everhart@netapp.com, nfsv4@ietf.org
Subject: Re: [nfsv4] LAYOUTCOMMIT discussion

Good point.

There are certainly clustered file systems where none of this would be
required.

My intention was to specify a minimal set of occasions on which the
attribute updates occur.  Having them happen more frequently, or
always, as would be the case in a clustered file system, was certainly
intended to be allowed.  I guess there should be some text explaining
that.
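
To illustrate the "minimal set plus anything extra" idea, here is a
rough sketch, not proposed text.  The three minimal occasions are the
ones in the proposal quoted below; every name in the code (Occasion,
MINIMAL_OCCASIONS, updates_attributes, CLUSTERED_EXTRA) is made up for
illustration and appears in no specification.

```python
from enum import Enum, auto


class Occasion(Enum):
    """Hypothetical labels for points at which attributes could be updated."""
    CLIENT_LAYOUTCOMMIT = auto()   # client sends LAYOUTCOMMIT before CLOSE
    LEASE_EXPIRATION = auto()      # server does the equivalent on lease expiry
    LEASE_PERIOD_ELAPSED = auto()  # periodic propagation for long-open files
    DS_WRITE = auto()              # every individual write to a data server


# The minimal set: the occasions named in the proposal.
MINIMAL_OCCASIONS = frozenset({
    Occasion.CLIENT_LAYOUTCOMMIT,
    Occasion.LEASE_EXPIRATION,
    Occasion.LEASE_PERIOD_ELAPSED,
})

# Assumption: a clustered back end effectively updates on every write.
CLUSTERED_EXTRA = frozenset({Occasion.DS_WRITE})


def updates_attributes(occasion, extra=frozenset()):
    """True if size/change attributes are brought up to date at this occasion.

    `extra` models a server that chooses to update more often than the
    minimal set requires, which is allowed.
    """
    return occasion in MINIMAL_OCCASIONS or occasion in extra
```

A server that does only the minimum would call updates_attributes()
with no extra set; a clustered one would pass CLUSTERED_EXTRA.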

 

-----Original Message-----
From: Everhart, Craig [mailto:Craig.Everhart@netapp.com] 
Sent: Wednesday, July 28, 2010 2:26 AM
To: Noveck, David; nfsv4@ietf.org
Subject: RE: [nfsv4] LAYOUTCOMMIT discussion

Dave's great points about whether LAYOUTCOMMIT should be an implicit
part of implicit CLOSEs are useful, and I agree we ought to talk about
them.

As I understood at least one discussion on Monday around the lunch
table, the files-oriented version of pNFS felt that LAYOUTCOMMIT might
be optional if implemented on some kinds of file systems, in particular
many clustered back ends.  As I recall, the concern was whether simply
doing WRITEs and COMMITs to the file-oriented data servers would
sufficiently update the mtime and change attributes, as returned by the
MDS, or whether a separate LAYOUTCOMMIT was necessary to communicate
those changes back to the MDS.

If that's a reasonable capturing of the semantic issue involved, then it
may be straightforward to propose prose and semantics to cover it, such
as an option by which the server can tell the files-oriented client that
LAYOUTCOMMITs are not required after committed writes to the data
servers in a given MDS/DS regime.
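
As a rough illustration of what such an option might mean on the client
side, here is a small sketch.  The flag name layoutcommit_required and
the function close_sequence are hypothetical; no such flag exists in
the protocol today, and the sketch assumes the WRITEs and COMMITs to
the data servers have already completed.

```python
from typing import List


def close_sequence(wrote_via_layout: bool,
                   layoutcommit_required: bool = True) -> List[str]:
    """Return the operations a files-layout client sends to the MDS at close.

    wrote_via_layout:      the client wrote to the data servers through a
                           file layout while the file was open.
    layoutcommit_required: hypothetical server-supplied option; False would
                           mean committed DS writes already update the
                           attributes the MDS returns.
    """
    ops = []
    if wrote_via_layout and layoutcommit_required:
        # Propagate size and change attribute to the MDS before closing.
        ops.append("LAYOUTCOMMIT")
    ops.append("CLOSE")
    return ops
```

With the default, a client that wrote through a layout sends
LAYOUTCOMMIT and then CLOSE; if the server indicated that LAYOUTCOMMIT
is not required, it sends only CLOSE.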

I look forward to discussions on these points, perhaps in person here in
Maastricht.

		Craig

> -----Original Message-----
> From: Noveck_David@emc.com [mailto:Noveck_David@emc.com]
> Sent: Tuesday, July 27, 2010 9:30 PM
> To: nfsv4@ietf.org
> Subject: [nfsv4] LAYOUTCOMMIT discussion
> 
> I'd like to make sure we pursue this and come to some sort of conclusion
> or at least make progress toward a solution.
> 
> One problem with the LAYOUTCOMMIT description is that the text is very
> block-oriented.  That's a result of the fact that 80% of the
> functionality is appropriate to block layouts, but it needs to explain
> what is and what isn't done in the case of file layouts.  A particular
> example is loca_last_write_offset.  This is required for block because
> you may write one block of a file when only one byte is part of the
> file.  But, for file, the DS knows the last offset and allowing you to
> specify a possibly invalid one is not appropriate.
> 
> But as far as the issue about when and if to use it, the troubling part
> is the following last paragraph of section 13.10.
> 
>    The NFSv4.1 protocol only provides close-to-open file data cache
>    semantics;
> 
> This is NOT what the v4.1 protocol provides.  It is up to the client to
> determine if it caches data, and many clients don't, for good reasons.
> As far as the protocol goes, leaving pNFS aside, if you do a write the
> change attribute will change, as it is supposed to.
> 
>    meaning that when the file is closed, all modified data is
>    written to the server.  When a subsequent OPEN of the file is done,
>    the change attribute is inspected for a difference from a cached
>    value for the change attribute.  For the case above, this means that
>    a LAYOUTCOMMIT will be done at close (along with the data WRITEs) and
>    will update the file's size and change attribute.  Access from
>    another client after that point will result in the appropriate size
>    being returned.
> 
> "a LAYOUTCOMMIT 'will be done' at close".  Not MUST or SHOULD.  And the
> text suggests that the client adds this op as opposed to it being part
> of CLOSE, but it isn't absolutely clear about that.
> 
> The reason that it is better for this to be an automatic part of CLOSE
> is that there are going to be instances in which the file is closed when
> no CLOSE is done (e.g. lease expiration), and the server had better do a
> LAYOUTCOMMIT at that point or everything is a mess.
> 
> So here is my proposal for what should be written in this place:
> 
>     Many uses of the NFSv4.1 protocol are based on close-to-open
>     caching semantics and it is a requirement that the modify time
>     is updated frequently enough to support this.  In the case of
>     the file layout type, when there have been writes done through
>     a file layout on a file being closed, the client SHOULD do a
>     LAYOUTCOMMIT after completion of all the writes and before the
>     CLOSE.  This will ensure that the change attribute and file size
>     are updated appropriately so that access from another client
>     after that point will result in the appropriate attributes
>     being returned.
> 
>     When a file open for write is closed as a result of a lease
>     expiration, and the client has a layout for that file, the server
>     must perform the equivalent of a LAYOUTCOMMIT before closing the
>     file, typically by fetching current attributes from the DS, to
>     ensure that the file's attributes are properly updated.  When a
>     lease expiration involves multiple files open for write for which
>     there are associated layouts, the server is free to fetch
>     attributes for multiple files from a DS in a single control
>     protocol request.
> 
>     When a lease is expired but the close does not occur immediately
>     the server still SHOULD perform the equivalent of a LAYOUTCOMMIT
>     at the point of lease expiration.
> 
>     In many applications, there are files that essentially are never
>     closed.  In order to provide more meaningful attribute values in
>     such environments, the server SHOULD ensure that the propagation
>     of attributes (size, change, etc.) is done at least every lease
>     period.  Because attribute updates for many files from a single DS
>     could be gathered in a single request, the resources needed to do
>     this are not very large.
> 
>  Comments?
> 
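
The batching mentioned in the proposal (fetching attributes for several
files from one DS in a single control protocol request) could be
sketched roughly as follows.  The control protocol between MDS and DS
is not specified by NFSv4.1, so batch_by_ds and the string ids are
purely illustrative; the sketch only shows the grouping step, not the
request itself.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def batch_by_ds(files: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group files needing an attribute refresh by the DS that holds them.

    files: (file_id, ds_id) pairs for files open for write that have
           associated layouts (ids are hypothetical placeholders).
    Returns one list of file ids per DS, so that each DS can be asked
    for all of its files in a single request.
    """
    batches = defaultdict(list)
    for file_id, ds_id in files:
        batches[ds_id].append(file_id)
    return dict(batches)
```

The same grouping would serve both the lease-expiration case and the
once-per-lease-period propagation, since each needs at most one
request per DS.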