Re: [nfsv4] AD Evaluation for draft-ietf-nfsv4-flex-files-14.txt

Thomas Haynes <loghyr@primarydata.com> Tue, 07 November 2017 21:42 UTC

From: Thomas Haynes <loghyr@primarydata.com>
To: Spencer Dawkins <spencerdawkins.ietf@gmail.com>
CC: "draft-ietf-nfsv4-flex-files@ietf.org" <draft-ietf-nfsv4-flex-files@ietf.org>, "nfsv4-ads@ietf.org" <nfsv4-ads@ietf.org>, "nfsv4@ietf.org" <nfsv4@ietf.org>
Date: Tue, 07 Nov 2017 21:40:53 +0000
Message-ID: <BC4F0C02-528E-4ECB-8937-194D25721DBE@primarydata.com>
References: <CAKKJt-cJTyCDtptO8NvZXznQ31O5kNn4ffh9TVgNBtioFRw7yw@mail.gmail.com>
In-Reply-To: <CAKKJt-cJTyCDtptO8NvZXznQ31O5kNn4ffh9TVgNBtioFRw7yw@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/Zl3J7ekNA5QBWxsHucV3W4tYv0c>

Hi Spencer,

Thanks for the review. I had to think about what you were asking and organize a response.

Tom

On Nov 7, 2017, at 11:42 AM, Spencer Dawkins at IETF <spencerdawkins.ietf@gmail.com> wrote:

Hi, authors,

I apologize for not jumping on this draft when publication was requested - it shows up with that status in the datatracker, but a quick search of my e-mail archive doesn't show the corresponding notification e-mail to me. But, having noticed that I should be reading it ...

I have some questions, which I'd like to work through before requesting Last Call. Most are harmless, and a few are requirements language questions, but I really need to understand the IANA Considerations section note about Table 3.

Just administratively, I know that automatic I-D submissions are closed until next Monday, and NFSv4 isn't meeting at IETF 100, so I'd expect to wait until after IETF 100 to request Last Call (since we typically extend Last Calls that overlap meeting weeks anyway). If the chairs would like for me to handle this in another way, please let me know.

As an author, I am fine with that constraint.


And thanks for your work on this.

Spencer (D)

--

I think

   This document does
   not provide a standard's track control protocol.

should be "standard track control protocol".

Ack


I think

   The requirements for the a
   control protocol are specified in [RFC5661] and clarified in
   [pNFSLayouts].

should either be "for a control protocol" or "for the control protocol", but not both.

Ack



I don't understand what "need not be a protocol as normally understood" in

     control protocol:  is the particular mechanism that an implementation
      of a layout type would use to meet the control communication
      requirement for that layout type.  This need not be a protocol as
      normally understood.  In some cases the same protocol may be used
      as a control protocol and storage protocol.

means. You do expand the definition in the next sentence, but that sentence only explains something about a protocol as I understand protocols. Could you also add an example of the kind of thing you're thinking of as "not a protocol"?

I thought it did :-)

How about:

   This need not be a protocol dedicated to the control communication. For
   example, the control protocol could be defined over a storage protocol.

If this still isn’t clear (which I admit may be the case), then we will have to point to a specific example.



Is it possible to add a sentence or two in Section 2, that explains the tradeoffs between loose and tight coupling? I understand there are two alternatives (and the differences are described throughout the doc), but don't understand at a high level how I'd decide which one to implement, if I was starting from scratch.

How about:

While tight coupling can provide a better integration
of the metadata server to the storage devices, it
requires either a standard track control protocol
or that a vendor implements both the metadata
server and the storage devices. In turn, the loose
coupling allows the flexibility of utilizing any storage
device that supports the storage protocol.



I'm not a genius of "code in drafts", but shouldn't

   /// /*
   ///  * Copyright (c) 2012 IETF Trust and the persons identified
   ///  * as authors of the code. All rights reserved.
   ///  *

be 2017? The template in [LEGAL] is "insert current year" …

Lol - is it current year of the current draft or current year as of when we started?



I'm seeing major and minor version numbers, but I'm not seeing mention of https://datatracker.ietf.org/doc/rfc8178/ -style extensions. Does that matter? Or would support for extensions simply be detected in the way described at the end of https://tools.ietf.org/html/rfc8178#section-4.3?


Ahh, I was going to suggest another paragraph when I realized that this document
is built on top of the pNFS Layout Types Registry of RFC5661 (see Section 16)
and as such does not present any extensions along the lines of RFC8178.

I.e., we allow for an NFSv4.1 implementation of the flex files layout type because of RFC5661,
while RFC8178 would disallow extensions to NFSv4.1.



I'm fine with all of the SHOULDs and SHOULD NOTs in

   The client is free to use any of the network addresses as a
   destination to send storage device requests.  If some network
   addresses are less desirable paths to the data than others, then the
   MDS SHOULD NOT include those network addresses in ffda_netaddrs.  If
   less desirable network addresses exist to provide failover, the
   RECOMMENDED method to offer the addresses is to provide them in a
   replacement device-ID-to-device-address mapping, or a replacement
   device ID.  When a client finds no response from the storage device
   using all addresses available in ffda_netaddrs, it SHOULD send a
   GETDEVICEINFO to attempt to replace the existing device-ID-to-device-
   address mappings.  If the MDS detects that all network paths
   represented by ffda_netaddrs are unavailable, the MDS SHOULD send a
   CB_NOTIFY_DEVICEID (if the client has indicated it wants device ID
   notifications for changed device IDs) to change the device-ID-to-
   device-address mappings to the available addresses.  If the device ID
   itself will be replaced, the MDS SHOULD recall all layouts with the
   device ID, and thus force the client to get new layouts and device ID
   mappings via LAYOUTGET and GETDEVICEINFO.

except the last one. Can you help me understand why the MDS would not recall all layouts and force clients to get new layouts and mappings?


Argh, should not have MDS in the first place!

Should be metadata server.

The metadata server does not need to recall the layouts, as the client should
check any layout stateids it holds that refer to the device ID being replaced.

https://tools.ietf.org/html/rfc5661#section-18.40.4

   o  CB_NOTIFY_DEVICEID deletes a device ID.  If the client believes it
      has layouts that refer to the device ID, then it is possible that
      layouts referring to the deleted device ID have been revoked.  The
      client should send a TEST_STATEID request using the stateid for
      each layout that might have been revoked.

Although in contradiction: https://tools.ietf.org/html/rfc5661#section-20.12.3

   NOTIFY4_DEVICEID_DELETE
      Deletes a device ID from the mappings.  This notification MUST NOT
      be sent if the client has a layout that refers to the device ID.
      In other words, if the server is sending a delete device ID
      notification, one of the following is true for layouts associated
      with the layout type:

      *  The client never had a layout referring to that device ID.

      *  The client has returned all layouts referring to that device
         ID.

      *  The server has revoked all layouts referring to that device ID.

      The notification is encoded in a value of data type
      notify_deviceid_delete4.  After a server deletes a device ID, it
      MUST NOT reuse that device ID for the same layout type until the
      client ID is deleted.

Which implies that the SHOULD is a MUST, except that the client
can deal with it via TEST_STATEID.

Anyone want to chime in?
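
To make the client-side alternative concrete, here is a minimal Python sketch of the behavior quoted from RFC 5661 Section 18.40.4: when a device ID is deleted, the client sends TEST_STATEID for each layout that refers to it. The `Layout` class, the string stateids and device IDs, and the `test_stateid` callback are hypothetical stand-ins for illustration only; they are not taken from the draft or from any real NFS client.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class Layout:
    """Hypothetical stand-in for a layout held by the client."""
    stateid: str                 # placeholder for the layout stateid
    device_ids: FrozenSet[str]   # device IDs this layout refers to


def stateids_to_test(layouts: List[Layout], deleted_device_id: str) -> List[str]:
    """Stateids of layouts that refer to the deleted device ID."""
    return [l.stateid for l in layouts if deleted_device_id in l.device_ids]


def handle_deviceid_delete(
    layouts: List[Layout],
    deleted_device_id: str,
    test_stateid: Callable[[str], bool],
) -> List[Layout]:
    """Keep layouts that are unaffected or still valid per TEST_STATEID.

    test_stateid is an assumed callback that returns True when the server
    reports the stateid as still valid.
    """
    suspect = set(stateids_to_test(layouts, deleted_device_id))
    kept = []
    for layout in layouts:
        if layout.stateid in suspect and not test_stateid(layout.stateid):
            continue  # revoked: the client would need a new LAYOUTGET
        kept.append(layout)
    return kept
```

Only layouts that mention the deleted device ID are tested; the rest are kept without any extra round trips.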


Is it correct that there's no difference between the client's action when receiving

   NFS4ERR_LAYOUTTRYLATER:  there is some issue preventing the layout
      from being granted.  If the client already has an appropriate
      layout, it should continue with I/O to the storage devices.

and

   NFS4ERR_DELAY:  there is some issue preventing the layout from being
      granted.  If the client already has an appropriate layout, it
      should not continue with I/O to the storage devices.

? That wouldn't surprise me, but I thought I should ask.

Yes, there is a difference. Look at what is recommended if the client already has an appropriate layout.
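
A minimal Python sketch of that difference, based only on the two definitions quoted above. The enum and function names are hypothetical, the enum values are not the wire values of the error codes, and treating "no appropriate layout" as "no storage device I/O" is an assumption (without a layout the client has nothing to direct that I/O with).

```python
from enum import Enum, auto


class LayoutGetError(Enum):
    """Hypothetical labels for the two errors; not the on-the-wire values."""
    LAYOUTTRYLATER = auto()
    DELAY = auto()


def may_continue_storage_io(error: LayoutGetError, has_appropriate_layout: bool) -> bool:
    """Whether the client should keep doing I/O to the storage devices."""
    if not has_appropriate_layout:
        # Assumption: with no usable layout there is no storage device I/O.
        return False
    # LAYOUTTRYLATER: continue with the layout already held.
    # DELAY: do not continue with I/O to the storage devices.
    return error is LayoutGetError.LAYOUTTRYLATER
```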



I think

   In [RFC5661], the file layout type is defined such that the
   relationship between multipathing and filehandles can result in
   either 0, 1, or N filehandles (see Section 13.3).  Some rationals for
   this are clustered servers which share the same filehandle or
   allowing for multiple read-only copies of the file on the same
   storage device.

should be "rationales".

Ack


Could you help me understand why

   The metadata server SHOULD recall any outstanding layouts to allow it
   exclusive write access to the stripes being recovered and to prevent
   other clients from hitting the same error condition.  In these cases,
   the server MUST complete recovery before handing out any new layouts
   to the affected byte ranges.

is not a MUST? ("Why wouldn't the MDS do that?")

Ack.

Although what might have been meant here is that:

  The metadata server MUST recall any outstanding layout, on those segments
  which are being recovered, to allow it …

I.e., the SHOULD allowed non-impacted layout segments to not be recalled.
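
As a rough illustration of "only the segments being recovered", the following Python sketch selects the outstanding layout segments whose byte ranges overlap a range under recovery. The `Segment` type and function names are hypothetical, and the sketch assumes finite lengths (it does not model a "to end of file" length).

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Segment:
    """Hypothetical byte range: [offset, offset + length)."""
    offset: int
    length: int

    @property
    def end(self) -> int:
        return self.offset + self.length


def overlaps(a: Segment, b: Segment) -> bool:
    """True when the two half-open byte ranges share at least one byte."""
    return a.offset < b.end and b.offset < a.end


def layouts_to_recall(outstanding: List[Segment], recovering: List[Segment]) -> List[Segment]:
    """Outstanding segments that touch any range being recovered."""
    return [s for s in outstanding if any(overlaps(s, r) for r in recovering)]
```

Segments that do not touch a recovering range are left alone, which is the case the original SHOULD appears to have allowed for.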


Is "hot" a term of art for the NFS community? (If so, is there a reference with a definition that could be included on first use?)

Not sure I would say it is a term of art for NFS, but perhaps one for computing in general.

And unfortunately, a search on "hot computer performance" or "hot spot computer performance"... oh
wait, that is it:

https://en.wikipedia.org/wiki/Hot_spot_(computer_programming)

But not so sure I am going to reference Wikipedia here...


I'm really confused by

   Note, [RFC5661] should have also defined (see Table 3):

   +---------------------------------+-------+-----------+-----+----------+
   | Recallable Object Type Name     | Value | RFC       | How | Minor    |
   |                                 |       |           |     | Versions |
   +---------------------------------+-------+-----------+-----+----------+
   | RCA4_TYPE_MASK_OTHER_LAYOUT_MIN | 12    | [RFC5661] | L   | 1        |
   | RCA4_TYPE_MASK_OTHER_LAYOUT_MAX | 15    | [RFC5661] | L   | 1        |
   +---------------------------------+-------+-----------+-----+----------+

              Table 3: Recallable Object Type Assignments

As best I can tell, https://www.iana.org/assignments/nfsv4-recallable-object-types/nfsv4-recallable-object-types.xhtml#nfsv4-recallable-object-types-1 defines them as

RCA4_TYPE_MASK_OBJ_LAYOUT_MIN 8 [RFC5661] L 1
RCA4_TYPE_MASK_OBJ_LAYOUT_MAX 9 [RFC5661] L 1

so, the values don't match. What am I missing? Is this a request to IANA to change the assigned values, or something else?

No, you are looking at the wrong entries: those are the OBJ_LAYOUT values, whereas Table 3 is about the OTHER_LAYOUT ones. Note in https://tools.ietf.org/html/rfc5661#section-20.6.1 :


   const RCA4_TYPE_MASK_RDATA_DLG          = 0;
   const RCA4_TYPE_MASK_WDATA_DLG          = 1;
   const RCA4_TYPE_MASK_DIR_DLG            = 2;
   const RCA4_TYPE_MASK_FILE_LAYOUT        = 3;
   const RCA4_TYPE_MASK_BLK_LAYOUT         = 4;
   const RCA4_TYPE_MASK_OBJ_LAYOUT_MIN     = 8;
   const RCA4_TYPE_MASK_OBJ_LAYOUT_MAX     = 9;
   const RCA4_TYPE_MASK_OTHER_LAYOUT_MIN   = 12;
   const RCA4_TYPE_MASK_OTHER_LAYOUT_MAX   = 15;
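
These constants are bit positions in the craa_type_mask bitmap of CB_RECALL_ANY, so the two ranges cover different bits. The Python sketch below only illustrates that; the helper name `range_mask` is hypothetical, and it assumes every bit fits in the first 32-bit word of the bitmap.

```python
# Bit positions copied from the constants listed above.
RCA4_TYPE_MASK_OBJ_LAYOUT_MIN = 8
RCA4_TYPE_MASK_OBJ_LAYOUT_MAX = 9
RCA4_TYPE_MASK_OTHER_LAYOUT_MIN = 12
RCA4_TYPE_MASK_OTHER_LAYOUT_MAX = 15


def range_mask(first_bit: int, last_bit: int) -> int:
    """Bitmask with every bit from first_bit to last_bit (inclusive) set."""
    mask = 0
    for bit in range(first_bit, last_bit + 1):
        mask |= 1 << bit
    return mask


obj_mask = range_mask(RCA4_TYPE_MASK_OBJ_LAYOUT_MIN, RCA4_TYPE_MASK_OBJ_LAYOUT_MAX)
other_mask = range_mask(RCA4_TYPE_MASK_OTHER_LAYOUT_MIN, RCA4_TYPE_MASK_OTHER_LAYOUT_MAX)

# The two ranges share no bits.
print(hex(obj_mask), hex(other_mask), obj_mask & other_mask)
```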


This section simply points out that RFC5661 should have defined these values in
https://tools.ietf.org/html/rfc5661#section-22.3.1

They might be covered as reserved for private use in
https://www.iana.org/assignments/nfsv4-recallable-object-types/nfsv4-recallable-object-types.xhtml#nfsv4-recallable-object-types-1.
But that should really be called out.

Although perhaps this (in https://tools.ietf.org/html/rfc5661#section-20.6.3) covers that:

   RCA4_TYPE_MASK_OTHER_LAYOUT_MIN to RCA4_TYPE_MASK_OTHER_LAYOUT_MAX

      This range is reserved for telling the client to recall layouts of
      experimental or site-specific layout types (see Section 3.3.13).

I found it all confusing and I’m willing to drop it if you believe it is covered in RFC5661.