Re: [nfsv4] what does NFSv4.x look like and how is it managed...

<Noveck_David@emc.com> Thu, 25 March 2010 17:56 UTC

Date: Thu, 25 Mar 2010 13:54:44 -0400
Message-ID: <BF3BB6D12298F54B89C8DCC1E4073D80137046@CORPUSMX50A.corp.emc.com>
In-Reply-To: <4BAB71AB.4060500@oracle.com>
From: Noveck_David@emc.com
To: brian.l.wong@oracle.com, sfaibish@popimap.lss.emc.com
Cc: nfsv4@ietf.org
Subject: Re: [nfsv4] what does NFSv4.x look like and how is it managed...

> I realize that the mds-to-ds communication is deliberately 
> not specified, but I think we need to be a bit more proactive 
> than that. 

The details of how the communication is done are not specified, but the
requirements for what needs to be communicated can and should be
specified. Further, the implementer is free to communicate more
information than the spec requires in order to make the system
serviceable, and I expect typical implementations to do so.

The client-ds and client-mds communication is specified (and thus may be
unduly limited), and here we may need to expand what is communicated to
make the system more usable in the face of various failures and
misconfigurations.

I take your point that this should not be limited to permission issues.

-----Original Message-----
From: nfsv4-bounces@ietf.org [mailto:nfsv4-bounces@ietf.org] On Behalf Of Brian Wong
Sent: Thursday, March 25, 2010 10:23 AM
To: faibish, sorin
Cc: nfsv4 nfsv4
Subject: Re: [nfsv4] what does NFSv4.x look like and how is it managed...

sfaibish wrote:
> Per our discussion during my Permission Access presentation to the NFSv4
> WG, I want to ask the WG to review the draft and come back with comments:
>
> http://www.ietf.org/id/draft-faibish-nfsv4-pnfs-access-permissions-check-02.txt
>
I am sympathetic to the fundamental idea that there has to be some 
greater client/mds/ds communication in the event of errors. I realize 
that the mds-to-ds communication is deliberately not specified, but I 
think we need to be a bit more proactive than that. In a system as 
complex as pNFS, at least some of the error handling really needs to 
include all of the interested parties. I'm happy to help draft something
in this space, whether that's an amendment to this proposal or something
new.

However, I am uncomfortable with the focus on permissions in this
specific draft. Permissions are certainly an issue, especially since
some of the problematic things are often well outside the span of
control of the likely administrator of a pNFS name space. (I.e., FC
switches are often administered by the networking organization, while
the arrays typically are under the authority of the storage
administrators; this division is even more prevalent if the storage
protocol is iSCSI.)

However, there are many other errors that are architecturally similar,
for example network partitions, where client-DS or MDS-DS communication
is interrupted but the other path remains. Or a node may have shut down
due to thermal overload (or a crash). These errors must be handled too,
and I suspect that there is an entire class of problems that is begging
to be handled here.

Hopefully this isn't opening up Pandora's box. I'm not yet concerned 
about the protocol being too chatty as (a) I would expect the chatting 
to occur only upon error and (b) we should see what it looks like before
we decide that it's too chatty.

On the draft itself, how do we propose to handle transitory errors? It 
is clear how permissions would be withdrawn, but I don't see how the 
situation would be reversed. Let's say that an array administrator 
erroneously sets a LUN mask to exclude all. It's clear that layouts will
be recalled. Once the administrator corrects the problem, does the
client have to re-issue the open? Suppose the problem is, effectively,
permissions, but in the form of a zoning error in the switch? (This one
might be as transitory as a reboot of an unfortunately located switch.)
This will have some significant interaction with the notion of removing 
an "offending" device from the list of valid devices.

In section 3.3.3, what happens if the CB_LAYOUTACCESSCHECK returns 
inconsistent information? For example, if client A has a LUN mask to the
SD, but client B does not?

Presumably in section 4.2 "client ... cannot write to the storage 
devices" really should be "access the storage devices", or is it a syntax
error to have read-only access? I would have thought that a read-only
open would not be at odds with lacking write access to the device.

blw
>
> I would also like to ask to schedule some time to discuss this draft in
> the NFSv4 WG in one of the Thursday FedFS calls.
