Re: [nfsv4] FW: I-D ACTION:draft-faibish-nfsv4-pnfs-access-permissions-check-03.txt
Tom Haynes <tom.haynes@oracle.com> Mon, 12 July 2010 20:21 UTC
david.black@emc.com wrote:
> This is the revised permissions check draft - it actually deals with any
> circumstance under which a client cannot access a pNFS data server and
> wants to report that inaccessibility.
>
> Thanks,
> --David
>

1) This:

   To the extent that an MDS can determine whether storage devices are
   accessible to clients, an MDS SHOULD NOT include a storage device in
   any pNFS layouts sent to a client that cannot access that storage
   device.  At a minimum, the server SHOULD perform these storage device
   accessibility checks before exporting a filesystem that supports pNFS
   and when the device configuration for such an exported filesystem is
   changed (e.g., to add a storage device).

implies to me that the MDS has to keep track of
LAYOUT4_RET_REC_FSID_NO_ACCESS and LAYOUT4_RET_REC_FILE_NO_ACCESS layout
return types per client. I.e., once it knows a client has problems with a
specific storage device, it should avoid using that device again.

Given that we need this mechanism for the client to report errors, how
then does the server know when it can start using these storage devices
again? Even if the MDS knows a positive change took place, it has to rely
on the client to do the checking.

Is this a difference between MUST and SHOULD? I.e., does having SHOULD
mean that the MDS can hand out the storage devices again to see if the
client can suddenly start using them again?

2) Is it NFS4ERR_PERM or NFS4ERR_ACCESS for access permission denial?

   o  NFS4ERR_PERM SHOULD be used for access permission denial; and

From 5661:

   15.1.6.1.  NFS4ERR_ACCESS (Error Code 13)

   Indicates permission denied.  The caller does not have the correct
   permission to perform the requested operation.  Contrast this with
   NFS4ERR_PERM (Section 15.1.6.2), which restricts itself to owner or
   privileged-user permission failures, and NFS4ERR_WRONG_CRED
   (Section 15.1.6.4), which deals with appropriate permission to delete
   or modify transient objects based on the credentials of the user that
   created them.

   15.1.6.2.  NFS4ERR_PERM (Error Code 1)

   Indicates requester is not the owner.  The operation was not allowed
   because the caller is neither a privileged user (root) nor the owner
   of the target of the operation.

Since this document talks about mount issues, I went back to RFC 1813 and
the MOUNT protocol. MNT can return MNT3ERR_ACCES if the client does not
have access rights to the export. I think NFS4ERR_ACCESS is more
consistent with prior protocols than NFS4ERR_PERM.

Going back to this draft, in section 3.3, I see:

   There are two NO_ACCESS layoutreturn_type4 values that indicate lack
   of storage device access, LAYOUT4_RET_REC_FSID_NO_ACCESS and
   LAYOUT4_RET_REC_FILE_NO_ACCESS.

and

   An NFS error (nfsstat4) is included in the layoutreturn data
   structures for these two types to distinguish access permission
   problems from device inaccessibility:

I think "access" has been overloaded here and, to clarify things,
NFS4ERR_PERM was selected over NFS4ERR_ACCESS.

The only other reasons I can see for using NFS4ERR_PERM instead of
NFS4ERR_ACCESS are related to security:

a) if the user credentials were insufficient, i.e., kerberized access to
   the storage device failed.

b) Section 13.12 of 5661:

   If the metadata server would deny a READ or WRITE operation on a file
   due to its ACL, mode attribute, open access mode, open deny mode,
   mandatory byte-range lock state, or any other attributes and state,
   the data server MUST also deny the READ or WRITE operation.

Which seems to point out a need for error codes for:

a) No access granted (mount for files, block devices have other means).
b) Permission denied for the operation.
c) Permission denied because of security.

The difference between b) and c) is that b) is per fileid and c) is per
fsid. So an MDS could still use the storage device in the layout for b),
but should avoid using it for c).
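
To make the a)/b)/c) split concrete, here is a rough Python sketch of how
an MDS might key its reaction on the scope of the failure. Nothing in it
comes from the draft or from 5661: the names Denial, SCOPE and
mds_may_keep_device are hypothetical, and the scope given for a) is an
assumption (the text above only assigns scopes to b) and c)).

```python
from enum import Enum


class Denial(Enum):
    """The three cases a), b), c) listed above (names are hypothetical)."""
    NO_ACCESS = "a"        # no access granted, e.g. the mount was refused
    OP_DENIED = "b"        # permission denied for the operation
    SECURITY_DENIED = "c"  # permission denied because of security


# Scope of each case. b) and c) follow the text above; the entry for a)
# is an assumption, added only so that the table is complete.
SCOPE = {
    Denial.NO_ACCESS: "fsid",        # assumption
    Denial.OP_DENIED: "fileid",      # from the text
    Denial.SECURITY_DENIED: "fsid",  # from the text
}


def mds_may_keep_device(denial: Denial) -> bool:
    """Return True if the MDS could keep handing the storage device out
    in layouts to the reporting client.

    A per-fileid failure says nothing about other files on the device,
    so the device stays usable; a per-fsid failure does not.
    """
    return SCOPE[denial] == "fileid"
```

With this table, mds_may_keep_device(Denial.OP_DENIED) is True and
mds_may_keep_device(Denial.SECURITY_DENIED) is False, which is the b)
versus c) distinction drawn above.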

3) I find:

   An NFS error (nfsstat4) is included in the layoutreturn data
   structures for these two types to distinguish access permission
   problems from device inaccessibility:

   o  NFS4ERR_PERM SHOULD be used for access permission denial; and

   o  NFS4ERR_NXIO SHOULD be used for inability to access a device.

   Other NFS errors MAY be used when they are appropriate.  All uses of
   these two layout return types that report errors SHOULD be logged by
   the client.

to be under-specified. What are the other errors that a server can see
and how is it supposed to react to those errors?

I'd like to see language about which errors are MANDATORY to be supported
and which are OPTIONAL. I know I can read the above to see that there are
only two MANDATORY ones, but I can also read it to see that all are
MANDATORY.

I don't want clients shoehorning every error code back into these two.
And I do want clarification on what a server should do with the OPTIONAL
codes. I.e., is it free to reuse those storage devices the next time that
client asks for a layout?

4) What if storage device A returns NFS4ERR_STALE to the client while
storage device B returns NFS4_OK for an operation on the same layout? But
for a different file with the same layout, the client is given NFS4_OK
from both DSs.

This isn't necessarily either a permission issue or a device
inaccessibility issue. (It could be either: the export was changed or the
filesystem indicated by the filehandle does not exist. It could also just
mean that the indicated file does not exist.)

Would this be where LAYOUT4_RET_REC_FILE_NO_ACCESS and NFS4ERR_NXIO are
appropriate?

Which leads to an even more interesting question: what constitutes an
NFS4ERR_NXIO error? As NFS4ERR_NXIO is defined by 5661 to not be a valid
return code from any operation or CB, I would take it to mean simply that
the storage device is not responding to the client. If the client got any
error back from the storage device, then it could not use NFS4ERR_NXIO.

As the NFS4ERR_STALE could simply mean that the filehandle refers to a
file which doesn't exist, is it appropriate to inform the server to no
longer use that storage device in the layouts assigned to that client?

Where this is going: perhaps now is a good time to add more informative
error codes from the storage device to the client. This would in turn
allow the client to send these back to the MDS. I.e., there is a world of
difference between "this device (fsid) does not exist" and "this file
does not exist".
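
As an illustration of the NFS4ERR_NXIO reading in point 4, here is a
minimal Python sketch of a client choosing which nfsstat4 to put in the
layoutreturn. It assumes my interpretation, not anything the draft
states: NFS4ERR_NXIO is used only when the storage device gave no reply
at all, and any error the device did return is passed through unchanged
rather than folded into NFS4ERR_PERM or NFS4ERR_NXIO. The function name
status_to_report and the use of None for "no reply" are hypothetical; the
numeric values are the nfsstat4 values from 5661.

```python
from typing import Optional

# nfsstat4 values as assigned in RFC 5661.
NFS4_OK = 0
NFS4ERR_NXIO = 6
NFS4ERR_STALE = 70


def status_to_report(ds_status: Optional[int]) -> Optional[int]:
    """Pick the nfsstat4 a client would report to the MDS.

    ds_status is the status returned by the storage device, or None if
    the device never answered (hypothetical convention). The result is
    None when there is nothing to report.
    """
    if ds_status is None:
        # No reply at all: the only case treated as NFS4ERR_NXIO here.
        return NFS4ERR_NXIO
    if ds_status == NFS4_OK:
        return None
    # The device answered with an error, so report that error as-is.
    return ds_status
```

Under these assumptions the scenario above reports NFS4ERR_STALE for
device A and nothing for device B. Whether the MDS should then stop
using device A for that client is exactly the open question.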