Re: [nfsv4] I-D Action: draft-ietf-nfsv4-flex-files-10.txt

Thomas Haynes <loghyr@primarydata.com> Tue, 18 July 2017 08:27 UTC

From: Thomas Haynes <loghyr@primarydata.com>
To: Rick Macklem <rmacklem@uoguelph.ca>
CC: "nfsv4@ietf.org" <nfsv4@ietf.org>
Date: Tue, 18 Jul 2017 08:27:21 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/nfsv4/kafZHg3TNj6au3DnXr-cbI74Wpk>

> On Jul 18, 2017, at 12:39 AM, Rick Macklem <rmacklem@uoguelph.ca> wrote:
> 
> First off. I'd like to thank both Thomas and David for their work on this. I am happy to see
> active work on it.
> 
> 1 - I think I mentioned this, but I no longer care about the stuff I posted last year.
>      (Basically the case where a server might only require one mirror to be written and
>       how quickly changes to a mirror were to be propagated. Others may still feel that
>       this is useful, but it no longer matters to me;-)
> 
> I am thinking of posting one issue at a time, since they probably require discussion.
> (If you'd prefer one post with everything in it, just let me know and I'll hold off on
> further individual posts and make one "collective" post.)
> Having said the above, here's the first one:
> 
>> 2.1.  LAYOUTCOMMIT
>> 
>>  When tightly coupled storage devices are used, the metadata server
>>  has the responsibility, upon receiving a LAYOUTCOMMIT (see
>>  Section 18.42 of [RFC5661]), of ensuring that the semantics of pNFS
>>  are respected (see Section 12.5.4 of [RFC5661]).  These do not
>>  include a requirement that data written to data storage device be
>>  stable upon completion of the LAYOUTCOMMIT.
> 
> - The above sentence seems to imply that the client is responsible to do COMMITs

The client is always responsible for doing COMMITs.

With a loosely coupled storage device, the client is responsible for doing the COMMIT because it knows which writes need to be committed.

With a tightly coupled storage device, if the client sends the COMMIT to the MDS, the MDS can ensure through the control protocol that the commit happens.

Actually, that is wrong: RFC 5661 states in Section 12.5.4 (on page 289):

   The data should be written and
   committed to the appropriate storage devices before the LAYOUTCOMMIT
   occurs. 

And that completely invalidates the last sentence of the first paragraph of Section 2.1.

(see below)


>   and seems to contradict the following para., which suggests that the COMMITs
>   are required for "loose coupling" only.
> 
>> In the case of loosely coupled storage devices, it is the
>> responsibility of the client to make sure the data file is stable
>> before the metadata server begins to query the storage devices about
>> the changes to the file.  If any WRITE to a storage device did not
>> result with stable_how equal to FILE_SYNC, a LAYOUTCOMMIT to the
>> metadata server MUST be preceded by a COMMIT to the storage devices
>> written to.  Note that if the client has not done a COMMIT to the
>>  storage device, then the LAYOUTCOMMIT might not be synchronized to
>>  the last WRITE operation to the storage device.
> - I think it would be simpler for a client implementor (like me;-) to just apply
>  this para. to both loose and tight coupling (ie. just drop the part of the first
>  sentence up to the "," and say "It is the responsibility of the client.."


It is safe to say/implement that the client does the COMMIT in either scenario.

Taking both of these points into consideration, I’ve rewritten the section as:

2.1.  LAYOUTCOMMIT

   The metadata server has the responsibility, upon receiving a
   LAYOUTCOMMIT (see Section 18.42 of [RFC5661]), of ensuring that the
   semantics of pNFS are respected (see Section 12.5.4 of [RFC5661]).
   These do include a requirement that data written to a data storage
   device be stable before the occurrence of the LAYOUTCOMMIT.

   It is the responsibility of the client to make sure the data file is
   stable before the metadata server begins to query the storage devices
   about the changes to the file.  If any WRITE to a storage device did
   not result with stable_how equal to FILE_SYNC, a LAYOUTCOMMIT to the
   metadata server MUST be preceded by a COMMIT to the storage devices
   written to.  Note that if the client has not done a COMMIT to the
   storage device, then the LAYOUTCOMMIT might not be synchronized to
   the last WRITE operation to the storage device.

I.e., there is no mention of the coupling model, as both have to obey the same requirements.
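
To make the client-side rule concrete, here is a minimal Python sketch of how a client might decide which storage devices need a COMMIT before it sends the LAYOUTCOMMIT. It is only an illustration: WriteResult, its fields, and devices_needing_commit() are hypothetical names, not anything from the draft or from an existing client. The enum values follow stable_how4 in RFC 5661.

```python
from dataclasses import dataclass
from enum import IntEnum


class StableHow(IntEnum):
    # Numeric values as defined for stable_how4 in RFC 5661.
    UNSTABLE = 0
    DATA_SYNC = 1
    FILE_SYNC = 2


@dataclass(frozen=True)
class WriteResult:
    # Hypothetical record of one WRITE reply; not a real client structure.
    device: str           # identifier of the storage device written to
    committed: StableHow  # stability level reported in the WRITE reply


def devices_needing_commit(results):
    """Return the devices that need a COMMIT before LAYOUTCOMMIT.

    A device is listed if any WRITE to it came back with something other
    than FILE_SYNC. Devices are returned in first-seen order, once each.
    """
    pending = []
    for result in results:
        if result.committed != StableHow.FILE_SYNC and result.device not in pending:
            pending.append(result.device)
    return pending


if __name__ == "__main__":
    writes = [
        WriteResult("ds1", StableHow.UNSTABLE),
        WriteResult("ds2", StableHow.FILE_SYNC),
        WriteResult("ds1", StableHow.FILE_SYNC),
        WriteResult("ds3", StableHow.DATA_SYNC),
    ]
    # COMMIT to each of these devices, then send LAYOUTCOMMIT to the MDS.
    print(devices_needing_commit(writes))
```

Note that the sketch never asks whether the storage device is loosely or tightly coupled, which matches the rewritten text: the client applies the same check in both cases.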


> OR
> - Make it conditional on a flag returned by the MDS.
> 
> In general, most of the "tight" vs "loose" coupling seems to be confusing to me.
> I can see that there may be the two cases w.r.t. security:
> 1 - Use the synthetic uid/gid and AUTH_SYS only.
> OR
> 2 - Use the same model as the MDS (same uid/gid/authenticator) and let the
>     MDS<->DS figure out how to implement it.
> Again, I think a flag returned by the MDS to indicate which security model is being
> used would be simpler than the "tight" vs "loose" concept. (The document can mention
> that some control protocol may be needed for the server implementation, but the
> client only cares what it needs to do and not how the server chooses to implement it.)


The concepts are presented to highlight the difficulties that a server developer will face.

I don’t think the client has to be aware of whether it is loosely or tightly coupled.

Consider an NFSv3 storage device made by X. An MDS made by Y can use NFSv3 as
the control protocol. It could CREATE to instantiate the data files, it could SETATTR
the mode bits/uid/gid to fence, and it could GETATTR to determine the change attribute,
the mtime, etc.

But if Y can recognize X and X provides some management APIs, then Y might be
able to leverage those APIs as an additional control protocol.

And as a client implementor, you are completely unaware of all of that.
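
As a rough illustration of the NFSv3-as-control-protocol case, the toy Python model below shows an MDS instantiating a data file, fencing it by changing the owner, and polling mtime. Everything here is an assumption made for the example: ToyStorageDevice and ToyMetadataServer are hypothetical in-memory stand-ins, the permission check is a simplified POSIX-style one, and no real NFSv3 calls are made.

```python
from dataclasses import dataclass


@dataclass
class Attrs:
    uid: int
    gid: int
    mode: int
    mtime: int = 0


class ToyStorageDevice:
    """In-memory stand-in for an NFSv3 storage device (hypothetical)."""

    def __init__(self):
        self.files = {}
        self.clock = 0  # logical clock used as a fake mtime

    def create(self, name, uid, gid, mode):
        self.files[name] = Attrs(uid, gid, mode)

    def setattr(self, name, uid=None, gid=None, mode=None):
        attrs = self.files[name]
        if uid is not None:
            attrs.uid = uid
        if gid is not None:
            attrs.gid = gid
        if mode is not None:
            attrs.mode = mode

    def getattr(self, name):
        return self.files[name]

    def write(self, name, uid, gid):
        # Simplified owner/group/other write-permission check.
        attrs = self.files[name]
        if uid == attrs.uid:
            allowed = attrs.mode & 0o200
        elif gid == attrs.gid:
            allowed = attrs.mode & 0o020
        else:
            allowed = attrs.mode & 0o002
        if not allowed:
            raise PermissionError(name)
        self.clock += 1
        attrs.mtime = self.clock


class ToyMetadataServer:
    """MDS that drives the device using only CREATE/SETATTR/GETATTR."""

    def __init__(self, device):
        self.device = device

    def instantiate(self, name, uid, gid):
        # CREATE the data file, writable only by the synthetic owner.
        self.device.create(name, uid, gid, 0o600)

    def fence(self, name):
        # SETATTR a new owner so previously issued credentials stop working.
        new_uid = self.device.getattr(name).uid + 1
        self.device.setattr(name, uid=new_uid)
        return new_uid

    def data_changed_since(self, name, last_mtime):
        # GETATTR to see whether the data file was modified.
        return self.device.getattr(name).mtime != last_mtime
```

In this model the client only ever sees the uid/gid it was handed in the layout and whether its WRITE succeeded; whether the MDS used plain NFSv3 or a vendor management API to get there is invisible to it.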



> 
> I just find the "loose" vs "tight" coupling a somewhat confusing "catch all" and I think
> a few flags returned by the MDS to the client will clarify what the client needs to do,
> but this is just a suggestion and I'm happy with anything that works. (I think this also
> applies to Data Server vs Storage Server. To be honest, I'd just get rid of all mention
> of these and just call them all data servers?)
> 

In practice, they are the same. In the original three standard documents, they are
treated a bit differently: a data server has an OS, whereas a storage device may or
may not have one.


> Again, thanks for working on this, rick
> 

And thanks for making me think!