Re: [nfsv4] LAYOUTCOMMIT discussion

"Everhart, Craig" <Craig.Everhart@netapp.com> Wed, 28 July 2010 07:28 UTC

Not to belabor the point, but I was trying to amplify it: not only is
the implicit LAYOUTCOMMIT unnecessary in some cases, but there are
server architectures in which LAYOUTCOMMIT is never required in ordinary
operation either, and there's some call for freeing clients from the
MDS communication overhead in such cases.
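
As a rough illustration only, the client-side decision could look
something like the sketch below.  The layoutcommit_not_required hint
is purely hypothetical (nothing like it exists in the protocol today),
and the FileLayout and close_sequence names are invented for this
example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FileLayout:
    # Hypothetical hint: True when the MDS has indicated that committed
    # DS writes already update size/mtime/change as seen through the MDS.
    layoutcommit_not_required: bool = False
    # True when the client wrote data through this layout.
    wrote_through_layout: bool = False


def close_sequence(layout: FileLayout) -> List[str]:
    """Return the ordered operations a client would issue on close."""
    ops: List[str] = []
    if layout.wrote_through_layout:
        ops.append("COMMIT")  # make the DS writes stable first
        if not layout.layoutcommit_not_required:
            ops.append("LAYOUTCOMMIT")  # report size/change to the MDS
    ops.append("CLOSE")
    return ops
```

With the hint set, a client that wrote through the layout would go
straight from COMMIT to CLOSE and skip the extra MDS round trip;
without it, the sequence includes the LAYOUTCOMMIT before the CLOSE.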

		Craig

> -----Original Message-----
> From: Noveck_David@emc.com [mailto:Noveck_David@emc.com]
> Sent: Wednesday, July 28, 2010 8:31 AM
> To: Everhart, Craig; nfsv4@ietf.org
> Subject: RE: [nfsv4] LAYOUTCOMMIT discussion
> 
> Good point.
> 
> There are certainly clustered file systems where none of this would be
> required.
> 
> My intention was to specify a minimal set of occasions when the
> attribute updates occurred.  Having it happen more frequently, or
> always, as would be the case in a clustered file system, was certainly
> intended to be allowed.  I guess there should be some text explaining
> that.
> 
> 
> 
> -----Original Message-----
> From: Everhart, Craig [mailto:Craig.Everhart@netapp.com]
> Sent: Wednesday, July 28, 2010 2:26 AM
> To: Noveck, David; nfsv4@ietf.org
> Subject: RE: [nfsv4] LAYOUTCOMMIT discussion
> 
> Dave's great points about whether LAYOUTCOMMIT should be an implicit
> part of implicit CLOSEs are useful, and I agree we ought to talk about
> them.
> 
> As I understood at least one discussion on Monday around the lunch
> table, the feeling for the files-oriented version of pNFS was that
> LAYOUTCOMMIT might be optional if implemented on some kinds of file
> systems, in particular many clustered back ends.  As I recall, the
> concern was whether simply doing WRITEs and COMMITs to the
> file-oriented data servers would sufficiently update the mtime and
> change attributes, as returned by the MDS, or whether a separate
> LAYOUTCOMMIT was necessary to communicate those changes back to the
> MDS.
> 
> If that's a reasonable capturing of the semantic issue involved, then
> it may be straightforward to propose prose and semantics to cover it,
> such as an option by which the server can tell the files-oriented
> client that LAYOUTCOMMITs are not required after committed writes to
> the data servers in a given MDS/DS regime.
> 
> I look forward to discussions on these points, perhaps in person here
> in Maastricht.
> 
> 		Craig
> 
> > -----Original Message-----
> > From: Noveck_David@emc.com [mailto:Noveck_David@emc.com]
> > Sent: Tuesday, July 27, 2010 9:30 PM
> > To: nfsv4@ietf.org
> > Subject: [nfsv4] LAYOUTCOMMIT discussion
> >
> > I'd like to make sure we pursue this and come to some sort of
> > conclusion, or at least make progress toward a solution.
> >
> > One problem with the LAYOUTCOMMIT description is that the text is
> > very block-oriented.  That's a result of the fact that 80% of the
> > functionality is appropriate to block layouts, but it needs to
> > explain what is and what isn't done in the case of file layouts.  A
> > particular example is loca_last_write_offset.  This is required for
> > block because you may write one block of a file when only one byte
> > is part of the file.  But, for file, the DS knows the last offset
> > and allowing you to specify a possibly invalid one is not
> > appropriate.
> >
> > But as far as the issue about when and if to use it, the troubling
> > part is the following last paragraph of section 13.10.
> >
> >    The NFSv4.1 protocol only provides close-to-open file data cache
> >    semantics;
> >
> > This is NOT what the v4.1 protocol provides.  It is up to the client
> > to determine if it caches data, and many clients don't, for good
> > reasons.  As far as the protocol goes, leaving pNFS aside, if you do
> > a write the change attribute will change, as it is supposed to.
> >
> >    meaning that when the file is closed, all modified data is
> >    written to the server.  When a subsequent OPEN of the file is
> >    done, the change attribute is inspected for a difference from a
> >    cached value for the change attribute.  For the case above, this
> >    means that a LAYOUTCOMMIT will be done at close (along with the
> >    data WRITEs) and will update the file's size and change
> >    attribute.  Access from another client after that point will
> >    result in the appropriate size being returned.
> >
> > "a LAYOUTCOMMIT 'will be done' at close".  Not MUST or SHOULD.  And
> > the text suggests that the client adds this op as opposed to it
> > being part of CLOSE, but it isn't absolutely clear about that.
> >
> > The reason that it is better for this to be an automatic part of
> > CLOSE is that there are going to be instances in which the file is
> > closed when no CLOSE is done (e.g. lease expiration), and the server
> > had better do a LAYOUTCOMMIT at that point or everything is a mess.
> >
> > So here is my proposal for what should be written in this place:
> >
> >     Many uses of the NFSv4.1 protocol are based on close-to-open
> >     caching semantics and it is a requirement that the modify time
> >     is updated frequently enough to support this.  In the case of
> >     the file layout type, when there have been writes done through
> >     a file layout on a file being closed, the client SHOULD do a
> >     LAYOUTCOMMIT after completion of all the writes and before the
> >     CLOSE.  This will ensure that the change attribute and file size
> >     are updated appropriately so that access from another client
> >     after that point will result in the appropriate attributes
> >     being returned.
> >
> >     When a file open for write is closed as a result of a lease
> >     expiration, and the client has a layout for that file, the
> >     server must perform the equivalent of a LAYOUTCOMMIT before
> >     closing the file, typically by fetching current attributes from
> >     the DS, to ensure that the file's attributes are properly
> >     updated.  When a lease expiration involves multiple files open
> >     for write for which there are associated layouts, the server is
> >     free to fetch attributes for multiple files from a DS in a
> >     single control protocol request.
> >
> >     When a lease is expired but the close does not occur immediately
> >     the server still SHOULD perform the equivalent of a LAYOUTCOMMIT
> >     at the point of lease expiration.
> >
> >     In many applications, there are files that essentially are never
> >     closed.  In order to provide more meaningful attribute values in
> >     such environments, the server SHOULD ensure that the propagation
> >     of attributes (size, change, etc.) is done at least every lease
> >     period.  Because attribute updates for many files from a single
> >     DS could be gathered in a single request, the resources to do
> >     this are not very large.
> >
> >  Comments?
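
The batching idea in the proposed text quoted above (one control
protocol request per DS at lease expiration) might be sketched roughly
as follows.  This is a minimal illustration, not a description of any
real server: batch_attr_fetches and the tuple layout are invented
here, and it assumes for simplicity that each file maps to a single
DS, which striped file layouts would not satisfy.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Each entry: (file_id, ds_id, needs_update), where needs_update is True
# for a file open for write that has an associated layout.
OpenFile = Tuple[str, str, bool]


def batch_attr_fetches(open_files: Iterable[OpenFile]) -> Dict[str, List[str]]:
    """Group files needing a LAYOUTCOMMIT-equivalent update by DS.

    Each key in the result stands for one control protocol request to
    that DS, covering every file listed under it.
    """
    per_ds: Dict[str, List[str]] = defaultdict(list)
    for file_id, ds_id, needs_update in open_files:
        if needs_update:
            per_ds[ds_id].append(file_id)
    return dict(per_ds)
```

For an expired lease covering files a and b on ds1 and file c on ds2,
all open for write with layouts, this yields two requests rather than
three; files without layouts or not open for write are skipped.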