Re: [nfsv4] (belated) copy of my review of draft-ietf-nfsv4-layout-types-03

Thomas Haynes <loghyr@primarydata.com> Wed, 19 July 2017 09:27 UTC

From: Thomas Haynes <loghyr@primarydata.com>
To: Dave Noveck <davenoveck@gmail.com>
CC: "nfsv4@ietf.org" <nfsv4@ietf.org>

Hi Dave,

I’ve evidently already applied some of these in my draft-05 working
copy. As such, I’ll either note I’ve made the change at this time or
ack that I made it at some earlier point.

I’m going to split my replies to avoid getting notified by the mailer
that my message is too large.

Here is part 1:

On Jul 16, 2015, at 4:21 PM, David Noveck <davenoveck@gmail.com> wrote:

I sent the comments below to Tom a while back.  This was the review that Tom was referring to at the Dallas meeting.  Although Tom characterized my comments as a "good review", I don't think one can assume that he necessarily agrees with everything in these comments.  Still, it would be good if the working group had access to these comments before discussing next steps for draft-ietf-nfsv4-layout-types.

_______________

Overall Comments

Overall this document has made a good start in addressing the problems it set out to solve.  I'd like to thank you for undertaking what might have seemed a thankless task.  It's now almost thankless but I hope it won't remain so.

I think the problem it deals with is more difficult than I expected it to be.  I hadn't recognized the complexity that derives from:

  *   The fact that we have three components to provide interoperability among, rather than just two.
  *   The fact that we are trying to accommodate such different protocols within the same framework.
  *   The fact that locking and permissions are similar enough that they need to be treated together but different enough that doing so is not easy.

There are a couple of issues that I think would be best addressed relatively soon:

  *   The fact that the phrase "control protocol" is used to mean a number of different things (i.e., requirements, and also the mechanism to satisfy them) makes things hard to understand.  Phrases like "no control protocol" and "real control protocol" compound the difficulty.
  *   Re-organization of section 3.

As noted above, I think the material in section 3 needs to be re-organized.   Much of it is not really related, or related only tangentially, to the control protocol, at least as I understand the term.  One possible classification of the material is as follows:

  *   Semantic requirements that apply to the protocol as a whole.
  *   Issues that relate to the interaction of the metadata server and the data storage devices.  This covers the control program requirements proper, or at least that part of them that is layout-type-specific.
  *   Issues that relate to the interaction of clients and the data storage devices.  This includes the need to clearly define the data storage protocol and to specify who is responsible for making sure that IO requests conform to valid layouts.
  *   Issues that relate to the interaction between the clients and the metadata server.

For all these, it is necessary to distinguish among:

  *   REQUIREMENTS specified in this document that apply directly to pNFS implementations.
  *   requirements for the document defining a layout type to clearly explain/define certain things.
  *   guidance regarding the need for layout type definition documents to impose appropriate REQUIREMENTS on implementations to ensure interoperability.

Comments by Section

Abstract

Suggest replacing the first sentence by the following two sentences:


This document defines the requirements which individual pNFS layout types need to meet in order to work within the parallel NFS (pNFS) framework as defined in RFC5661.  In so doing, it aims to more clearly distinguish between requirements for pNFS as a whole and those specifically directed to the pNFS File Layout.


Ack


In the last sentence suggest adding "in this regard" after "RFC5661".

Introduction

In the second paragraph suggest the following changes:

  *   In the first sentence, replace "being strictly for" by "applying only to”.

Ack



  *   In the second sentence, replace "I.e., since" by "Because" and "is some" by "has been some”.

Either done earlier by me or replaced.

In the second sentence of the third paragraph, suggest replacing "clarifies what are" by "specifies”.


Ditto.


Definitions

I'd like to offer the following suggestions for modification/clarification of the definitions. There are also proposed additions. In the case of proposed new definitions, the term to be defined is underlined.

We currently have the issue that it is almost impossible not to use terms in the definitions before they themselves are defined. Like you, I've abused to a degree the principle of alphabetical order so as to limit the difficulties. In addition, given the centrality of the concept of "Layout Type", I'm proposing bringing it forward in an introductory paragraph such as the following.

The concept of layout type has a central role in the definition and implementation of parallel NFS. Clients and servers implementing different layout types behave differently in many ways while conforming to the overall parallel NFS framework defined in [RFC5661] and this document. Different layout types may differ in:

  *   The method used to do IO operations directed to data storage devices.
  *   The requirements for communication between the Metadata server and the data storage devices.
  *   The means used to ensure that IO requests are only processed when the client holds an appropriate layout.
  *   The format and interpretation of nominally opaque data fields in pNFS-related NFSv4.x data structures.

Such matters are defined in a standards-track layout type specification. Except for the files layout type, which was defined in chapter 13 of [RFC5661], existing layout types are defined in their own RFCs and it is anticipated that new layout types will be defined in similar documents.


Evidently I added this as an introduction section.



It may be that we'll need a new concept (e.g. "layout-subtype" or "layout type variant") to deal with cases in which a layout type has variant classes of implementations that differ in some but not all of the items listed above. For example, loose and tight coupling variants of flex-files might fit this model and we should consider whether the same would apply for virtualized and non-virtualized variants of the block layout type.

  *   Control communication requirements: I think it is helpful here to separate requirements and mechanism. So I'm proposing:

For a particular layout type, defines the details regarding information (e.g., layouts, stateids, file metadata, and file data) which must be communicated between the metadata server and the storage devices.

  *   Control protocol:

Defines a particular mechanism that an implementation of a layout type would use to meet the control communication requirement for that layout type. This need not be a protocol as normally understood. In some cases the same protocol may be used as a control protocol and data access protocol.

  *   (file) data: suggest the following as a replacement

is that part of the file system object that contains the data to be read or written, as opposed to attributes of the object.  That is, it is the file contents.

  *   Data server (DS):  I have a suggestion (below) designed to resolve the confusion and apparent self-contradiction regarding this term:

is one of the pNFS servers which provide the contents of a file system object which is a regular file, when the file system object is accessed over a file-based protocol.  Note that this usage differs from that in [RFC5661] which applies the term in some cases even when other sorts of protocols are being used. Depending on the layout, there might be one or more data servers over which the data is striped.  Note that while the metadata server is strictly accessed over the NFSv4.1 protocol, depending on the Layout Type, the data server could be accessed via any file access  protocol that meets the pNFS requirements.


See section 2.1 for a comparison of this term and "data storage device".


Evidently I had made the above edits and not the below:


  *   fencing:  Suggest replacing "when" by "process by which”

Ack


  *   layout: suggest the following replacement:

contains information a client uses to access file data on a pNFS storage device.  This

We can’t say pNFS storage device. I know what you mean here, but I can read it another way.


will include specification of the protocol (layout type) and the identity of the storage devices to be used.



The bulk of the contents of the layout are defined in [RFC5662] as nominally opaque, but individual layout types may specify their own interpretation of layout data.


Edited in...
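
To make the "nominally opaque" point concrete, here is a minimal Python sketch, not taken from the draft. It assumes the layout_content4 shape from the RFC 5661 XDR (a layout type plus an opaque body) and uses the layout type numbers assigned there. The `LayoutContent` class, the `DECODERS` registry, and `interpret` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# Layout type values as assigned in RFC 5661.
LAYOUT4_NFSV4_1_FILES = 1
LAYOUT4_OSD2_OBJECTS = 2
LAYOUT4_BLOCK_VOLUME = 3


@dataclass
class LayoutContent:
    """Mirrors layout_content4: a layout type plus a nominally opaque body."""
    loc_type: int
    loc_body: bytes


# Hypothetical registry: each layout type specification would supply its own
# interpretation of the opaque body.
DECODERS: Dict[int, Callable[[bytes], object]] = {}


def interpret(layout: LayoutContent) -> object:
    """Decode the body if the layout type is known; otherwise leave it opaque."""
    decoder = DECODERS.get(layout.loc_type)
    if decoder is None:
        return layout.loc_body
    return decoder(layout.loc_body)
```

Generic pNFS code only ever looks at `loc_type`; everything inside `loc_body` is the business of the layout type definition.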


  *   layout iomode: Suggest replacing "read or" by "read-only access or”

Ack


  *   layout stateid: Suggest replacing "the difference between a layout stateid and a normal stateid" by "differences in handling between layout stateids and other stateids”.

Ack

  *   Layout Type: propose removing this in favor of the introductory paragraph suggested above

I’ve added a pointer because:

A) We should have something here
B) I don’t want to repeat it verbatim.



  *   (file) metadata: Suggest the following replacement:

is that part of the file system object that contains various descriptive data relevant to the file object, as opposed to the file data itself.  This could include the time of last modification, access time, eof position, etc.

Ack


  *   Metadata server (MDS):

     *   In the second sentence, suggest replacing "generating" by "generating, recalling, and revoking"
     *   Suggest deleting the third sentence and appending the following material to the end of the second sentence

, for performing directory operations, and for performing I/O operations to regular files when the clients direct these to the MDS itself.

Ack


  *   recalling a layout:
     *   in the first sentence, suggest replacing "is" by "occurs"
     *   also in the first sentence, suggest replacing "uses a back channel" by "issues a callback”.

Ack

And do we need to define a callback?


  *   revoking a layout:  suggest the following replacement:

occurs when the metadata server invalidates a specific layout.  Once revocation occurs, the metadata server will not accept as valid any reference to the revoked layout and the data storage device will not accept any client access based on the layout.

Ack

  *   stateid:  suggest the following replacement:

is a 128-bit quantity returned by a server that uniquely defines some set of locking-related state provided by the server. Stateids may designate state related to open files, to byte-range locks, to delegations, or to layouts.

Ack
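
As a rough illustration of the 128-bit size in the definition above: the RFC 5661 XDR represents a stateid as a 32-bit seqid followed by 12 opaque bytes. The Python sketch below encodes that shape. The helper names `pack_stateid` and `unpack_stateid` are hypothetical and not part of any draft text.

```python
import struct
from typing import Tuple

STATEID_OTHER_LEN = 12  # size of the opaque "other" field in stateid4


def pack_stateid(seqid: int, other: bytes) -> bytes:
    """Encode a stateid as a 4-byte big-endian seqid plus 12 opaque bytes."""
    if len(other) != STATEID_OTHER_LEN:
        raise ValueError("other must be exactly 12 bytes")
    return struct.pack(">I", seqid) + other


def unpack_stateid(raw: bytes) -> Tuple[int, bytes]:
    """Split a 16-byte encoded stateid back into (seqid, other)."""
    if len(raw) != 4 + STATEID_OTHER_LEN:
        raise ValueError("a stateid is exactly 16 bytes (128 bits)")
    (seqid,) = struct.unpack(">I", raw[:4])
    return seqid, raw[4:]
```

The same 16-byte shape is used whether the stateid designates open, byte-range lock, delegation, or layout state; only the handling differs.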


  *   storage device: suggest the following as a replacement

Designates the target to which clients may direct IO requests when they hold an appropriate layout. Note that each data server is a data storage device but that some data storage devices are not data servers. See section 2.1 for further discussion.

Ack


  *   storage protocol:

is the protocol used by clients to do IO operations to the data storage device.  Each layout type may specify its own data access protocol.  It is possible for a layout type to specify multiple data access protocols.

Ack


Section 2.1

First of all, suggest as a new title for this section: Use of the Terms "Data Server" and "Data Storage Device"

Proposed new content for this section is below:

In [RFC5661], these two terms are used somewhat inconsistently:

  *   In chapter 12, where pNFS in general is discussed, the term "data storage device" is used.
  *   In chapter 13, where the file layout type is discussed, the term "data server" is used.
  *   In other chapters, the term "data server" is used, even in contexts where the storage access type is not NFSv4.1 or any other file access protocol.

As this document deals with pNFS in general, it uses the more generic term "storage device" in preference to "data server".  The term "data server" is used only in contexts in which a file server is used as a data storage device.  Note that every data server is a storage device but that storage devices which use protocols which are not file access protocols are not data servers.

Since a given data storage device may support multiple layout types, a given device can potentially act as a data server for some set of storage protocols while simultaneously acting as a non-data-server data storage device for others.

And edited in...
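
The terminology proposed for section 2.1 can be summarized in a small Python sketch: a storage device counts as a data server only with respect to a file access protocol, so one device can be both at once. The protocol names and the `StorageDevice` class are illustrative assumptions, not definitions from the draft.

```python
from dataclasses import dataclass, field
from typing import Set

# Assumption for illustration: which storage protocols count as file access protocols.
FILE_ACCESS_PROTOCOLS = {"nfsv3", "nfsv4.1"}


@dataclass
class StorageDevice:
    """A target for client IO; it may support several storage protocols."""
    name: str
    protocols: Set[str] = field(default_factory=set)

    def is_data_server_for(self, protocol: str) -> bool:
        """True only when the device serves this protocol and it is file-based."""
        return protocol in self.protocols and protocol in FILE_ACCESS_PROTOCOLS


# One device acting as a data server for NFSv4.1 and as a
# non-data-server storage device for iSCSI.
dev = StorageDevice("dev1", {"nfsv4.1", "iscsi"})
```

Here `dev.is_data_server_for("nfsv4.1")` is true while `dev.is_data_server_for("iscsi")` is false, even though both protocols reach the same storage device.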



Requirements Language

This document poses special issues regarding the RFC2119 terms in that it states requirements/recommendations that apply both to implementations and to future specifications.   My preference is to apply them only to implementations but I will follow your existing practice in my comments which follow.  Still, whichever way you choose to go on this particular issue, this section should say something about the question.

Thinking cap on