Re: [nfsv4] draft-hellwig-nfsv4-scsi-layout-00.txt
bfields@fieldses.org (J. Bruce Fields) Tue, 17 March 2015 00:30 UTC
Return-Path: <bfields@fieldses.org>
X-Original-To: nfsv4@ietfa.amsl.com
Delivered-To: nfsv4@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id A2CA81ACD91 for <nfsv4@ietfa.amsl.com>; Mon, 16 Mar 2015 17:30:51 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: 0.788
X-Spam-Level:
X-Spam-Status: No, score=0.788 tagged_above=-999 required=5 tests=[BAYES_50=0.8, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001, T_RP_MATCHES_RCVD=-0.01] autolearn=ham
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id CRm3S4UgdSGI for <nfsv4@ietfa.amsl.com>; Mon, 16 Mar 2015 17:30:49 -0700 (PDT)
Received: from fieldses.org (fieldses.org [IPv6:2600:3c00::f03c:91ff:fe50:41d6]) by ietfa.amsl.com (Postfix) with ESMTP id EBB8C1ACD87 for <nfsv4@ietf.org>; Mon, 16 Mar 2015 17:30:48 -0700 (PDT)
Received: by fieldses.org (Postfix, from userid 2815) id 0B6B33D1B; Mon, 16 Mar 2015 20:30:48 -0400 (EDT)
Date: Mon, 16 Mar 2015 20:30:48 -0400
To: "Black, David" <david.black@emc.com>
Message-ID: <20150317003048.GA18854@fieldses.org>
References: <E721E887-7E17-4FC4-9CDB-19A2CB933ABE@primarydata.com> <20150309203020.GA21585@lst.de> <CE03DB3D7B45C245BCA0D243277949363DA09C@MX104CL02.corp.emc.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <CE03DB3D7B45C245BCA0D243277949363DA09C@MX104CL02.corp.emc.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
From: bfields@fieldses.org
Archived-At: <http://mailarchive.ietf.org/arch/msg/nfsv4/xyHRX_9fVqLI-USjp2X9cmar-ZI>
Cc: Christoph Hellwig <hch@lst.de>, NFSv4 <nfsv4@ietf.org>
Subject: Re: [nfsv4] draft-hellwig-nfsv4-scsi-layout-00.txt
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/nfsv4>, <mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/nfsv4/>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/nfsv4>, <mailto:nfsv4-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 17 Mar 2015 00:30:51 -0000
On Tue, Mar 10, 2015 at 12:56:17AM +0000, Black, David wrote: > > > 4) And to be clear, you have had an external review of the SCSI usage? By > > David and Robert? > > > > Yes. > > That was a while ago - I looked at the draft before Christoph submitted it. > Looking back at the email then, I think I wanted to see more details on the > use of SCSI Persistent Reservations. I should go look at the posted version > with this and my previous comments to Christoph in mind. Did you get a chance to look at this yet? The problems Christoph describes here really are annoying for any block pnfs implementation, so I'm anxious to see them resolved. I'm not a scsi expert and haven't tried to read the relevant specs, so I'm assuming that this is indeed a sensible way to address and fence devices, and that any scsi protocol features required are sufficiently ubiquitous to be useful. Other than that my comments are pretty minor. For totally trivial spelling and other stuff I've also appended a diff against git://git.infradead.org/users/hch/scsi-layout.git. Remove section 1.3: if somebody really needs these definitions, the right place would be 5661 or 5663. "The layout4 type is defined in RFC5662 as follows". That's not exactly true. Might be clearer to say: "the layout4 type defined in [RFC5662] is extended with a new value as follows:" (Same comment applies to http://tools.ietf.org/html/draft-bhalevy-nfsv4-flex-files-03.) Looks like both sbv_lu_name and svb_designator are used to refer to the same field. Does sbv_lu_name<> have some maximum size? "revoke extents from one client in order to transfer the extents to another client". This language is a cut-and-paste from 5563. (And not entirely accurate--there's othe reasons to recall a layout.) If only for attribution reason, make that an explicit quote. But actually I would consider just cutting the first couple sentences and say: [RFC 5663] suggests fencing clients using either LUN masking or cooperative clients.... "common notation of a client" doesn't sound right. How about just "common address for a client"? Or "common name for a client"? "To allow fencing individual systems, each system MUST use a unique Persistent Reservation key." I think that MUST should be lower case. You're describing a requirement on the protocol not really a requirement for any specific client or server behavior yet. "After a successful GETDEVICEINFO operation the client MUST register the registration key returned in sbv_pr_key with the storage device..." Must it really? Couldn't it change its mind and decide it doesn't want to use that device or layout after all? And in any case, if it skips the registration, the IO will just fail, so I don't know that we really need to make an explicit requirement here. As a description of typical use it's still helpful, so maybe word it something like: "Before performing the first IO to a device returned from a GETDEVICEINFO operation the client will register the registration key...". But if I understand right the MUST for unregistering is important as otherwise the registration just sits around until there's a conflict and the server may have to unnecessarily delay conflicting IO, etc. I assume this applies only if the client previously registered the key. So maybe: "When a client stops using a device it previously registered, it MUST..." Also: "stops using" is a little vague. "A client that detects I/O errors".... Is that any I/O error or just I/O errors due to fencing? "all outstanding layouts": why not just affected layouts? "forget any open devices": again, what's scope of "any"? What does "forget" mean? (Should it try to unregister?) "by a future LAYOUTGET": should that be: by a future GETDEVICEINFO? Or by a future GETDEVICEINFO for this device? Do we also need to instruct the MDS to generate a new reservation key? --b. diff --git a/scsi_middle_introduction.xml b/scsi_middle_introduction.xml index 09ff224ef731..3b8d5d376f66 100644 --- a/scsi_middle_introduction.xml +++ b/scsi_middle_introduction.xml @@ -24,7 +24,7 @@ structures for the LAYOUTGET and LAYOUTCOMMIT operations. This document does not directly interact with <xref target='RFC6688' />, - although the mechanisms described in this document also archive the + although the mechanisms described in this document also achieve the goals of <xref target='RFC6688' />, and do so in a more robust fashion that does not depend on the cooperation of the systems involved. Thus, the mechanisms specified in <xref target='RFC6688' /> are not necessary diff --git a/scsi_middle_layout.xml b/scsi_middle_layout.xml index 8501f231762a..330d73160480 100644 --- a/scsi_middle_layout.xml +++ b/scsi_middle_layout.xml @@ -117,8 +117,7 @@ field does not need to be set separately. Only certain combinations of "sbv_code_set" and "sbv_designator_type" are valid, please refer to <xref target="SPC3" /> for details, and note that ASCII may be used - as the code set for UTF-8 text that does not contain non-ASCII - characters. + as the code set for UTF-8 text that contains only ASCII characters. Note that a Device Identification VPD page MAY contain multiple descriptors with the same association, code set and designator type. @@ -191,7 +190,7 @@ <t> All rules for ordering and formation of a "pnfs_scsi_deviceaddr4" - structure are identical for those of a "pnfs_block_deviceaddr4" + structure are identical to those for a "pnfs_block_deviceaddr4" structure in <xref target='RFC5663' />, except that the new pnfs_scsi_base_volume_info4 PNFS_SCSI_VOLUME_BASE case is used in place of the pnfs_block_simple_volume_info4 PNFS_BLOCK_VOLUME_SIMPLE @@ -227,21 +226,21 @@ of type "Exclusive Access – All Registrants" on each SCSI logical unit exported to pNFS clients the MDS prevents access from any client that does not have an outstanding device device ID that gives the client - a reservation key to access to logical unit, and allows the MDS to + a reservation key to access the logical unit, and allows the MDS to revoke access to the logic unit at any time. </t> <section anchor='ssc:fencing:keys' title='PRs - Key Generation'> <t> - To allow fencing individual systems, each systems MUST use a unique + To allow fencing individual systems, each system MUST use a unique Persistent Reservation key. <xref target="SPC3" /> does not specify a way to generate keys. This document assigns the burden to generate unique keys to the MDS, which MUST generate a key for itself before - exporting a volume, and one for each client that access a volume. + exporting a volume, and one for each client that accesses a volume. The MDS MAY either generate a key for each client that accesses - logic units exported by the MDS, or to generate a key for each - [logical unit, client] combination. In case of a single key per + logic units exported by the MDS, or generate a key for each + [logical unit, client] combination. If using a single key per client, the MDS needs to be aware of the per-client fencing granularity. </t> @@ -257,7 +256,7 @@ action of "REGISTER", followed by a "PERSISTENT RESERVE OUT" command, with a service action of "RESERVE" and the type field set to 8h (Exclusive Access – All Registrants). - To make sure all I_T nexus are registered, the MDS SHOULD set the + To make sure all I_T nexuses are registered, the MDS SHOULD set the "All Target Ports" (ALL_TG_PT) bit when registering the key, or otherwise ensure the registration is performed for each initiator port. </t> @@ -290,16 +289,16 @@ In case of a non-responding client the MDS MUST fence the client by issuing a "PERSISTENT RESERVE OUT" command with the service action set to "PREEMPT" or "PREEMPT AND ABORT", the reservation key - field set to the server's reservation key, and the service action + field set to the server's reservation key, the service action reservation key field set to the reservation key associated with the non-responding client, and the type field set to 8h (Exclusive Access – All Registrants). </t> <t> - After the MDS preempted a client, all client I/O to the logical unit - will fail. The client should at this point return any layout that - refers to the device ID that poіnts to the logical unit. Note that + After the MDS preempts a client, all client I/O to the logical unit + fails. The client should at this point return any layout that + refers to the device ID that points to the logical unit. Note that the client can distinguish I/O errors due to fencing from other errors based on the "RESERVATION CONFLICT" status. Refer to <xref target="SPC3" /> for details.
- Re: [nfsv4] draft-hellwig-nfsv4-scsi-layout-00.txt J. Bruce Fields
- Re: [nfsv4] draft-hellwig-nfsv4-scsi-layout-00.txt Christoph Hellwig
- Re: [nfsv4] draft-hellwig-nfsv4-scsi-layout-00.txt J. Bruce Fields
- Re: [nfsv4] draft-hellwig-nfsv4-scsi-layout-00.txt Christoph Hellwig
- Re: [nfsv4] draft-hellwig-nfsv4-scsi-layout-00.txt J. Bruce Fields
- [nfsv4] draft-hellwig-nfsv4-scsi-layout-00.txt Tom Haynes
- Re: [nfsv4] draft-hellwig-nfsv4-scsi-layout-00.txt Christoph Hellwig
- Re: [nfsv4] draft-hellwig-nfsv4-scsi-layout-00.txt Black, David