Re: [nfsv4] draft-hellwig-nfsv4-scsi-layout-00.txt

bfields@fieldses.org (J. Bruce Fields) Tue, 17 March 2015 00:30 UTC

Return-Path: <bfields@fieldses.org>
X-Original-To: nfsv4@ietfa.amsl.com
Delivered-To: nfsv4@ietfa.amsl.com
Received: from localhost (ietfa.amsl.com [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id A2CA81ACD91 for <nfsv4@ietfa.amsl.com>; Mon, 16 Mar 2015 17:30:51 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: 0.788
X-Spam-Level:
X-Spam-Status: No, score=0.788 tagged_above=-999 required=5 tests=[BAYES_50=0.8, SPF_HELO_PASS=-0.001, SPF_PASS=-0.001, T_RP_MATCHES_RCVD=-0.01] autolearn=ham
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id CRm3S4UgdSGI for <nfsv4@ietfa.amsl.com>; Mon, 16 Mar 2015 17:30:49 -0700 (PDT)
Received: from fieldses.org (fieldses.org [IPv6:2600:3c00::f03c:91ff:fe50:41d6]) by ietfa.amsl.com (Postfix) with ESMTP id EBB8C1ACD87 for <nfsv4@ietf.org>; Mon, 16 Mar 2015 17:30:48 -0700 (PDT)
Received: by fieldses.org (Postfix, from userid 2815) id 0B6B33D1B; Mon, 16 Mar 2015 20:30:48 -0400 (EDT)
Date: Mon, 16 Mar 2015 20:30:48 -0400
To: "Black, David" <david.black@emc.com>
Message-ID: <20150317003048.GA18854@fieldses.org>
References: <E721E887-7E17-4FC4-9CDB-19A2CB933ABE@primarydata.com> <20150309203020.GA21585@lst.de> <CE03DB3D7B45C245BCA0D243277949363DA09C@MX104CL02.corp.emc.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <CE03DB3D7B45C245BCA0D243277949363DA09C@MX104CL02.corp.emc.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
From: bfields@fieldses.org
Archived-At: <http://mailarchive.ietf.org/arch/msg/nfsv4/xyHRX_9fVqLI-USjp2X9cmar-ZI>
Cc: Christoph Hellwig <hch@lst.de>, NFSv4 <nfsv4@ietf.org>
Subject: Re: [nfsv4] draft-hellwig-nfsv4-scsi-layout-00.txt
X-BeenThere: nfsv4@ietf.org
X-Mailman-Version: 2.1.15
Precedence: list
List-Id: NFSv4 Working Group <nfsv4.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/nfsv4>, <mailto:nfsv4-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/nfsv4/>
List-Post: <mailto:nfsv4@ietf.org>
List-Help: <mailto:nfsv4-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/nfsv4>, <mailto:nfsv4-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 17 Mar 2015 00:30:51 -0000

On Tue, Mar 10, 2015 at 12:56:17AM +0000, Black, David wrote:
> > > 4) And to be clear, you have had an external review of the SCSI usage? By
> > David and Robert?
> > 
> > Yes.
> 
> That was a while ago - I looked at the draft before Christoph submitted it.
> Looking back at the email then, I think I wanted to see more details on the
> use of SCSI Persistent Reservations.  I should go look at the posted version
> with this and my previous comments to Christoph in mind.

Did you get a chance to look at this yet?

The problems Christoph describes here really are annoying for any block
pnfs implementation, so I'm anxious to see them resolved.

I'm not a scsi expert and haven't tried to read the relevant specs, so
I'm assuming that this is indeed a sensible way to address and fence
devices, and that any scsi protocol features required are sufficiently
ubiquitous to be useful.

Other than that my comments are pretty minor.  For totally trivial
spelling and other stuff I've also appended a diff against
git://git.infradead.org/users/hch/scsi-layout.git.


Remove section 1.3: if somebody really needs these definitions, the
right place would be 5661 or 5663.

"The layout4 type is defined in RFC5662 as follows".  That's not exactly
true.  Might be clearer to say: "the layout4 type defined in [RFC5662]
is extended with a new value as follows:"  (Same comment applies to
http://tools.ietf.org/html/draft-bhalevy-nfsv4-flex-files-03.)

Looks like both sbv_lu_name and svb_designator are used to refer to the
same field.

Does sbv_lu_name<> have some maximum size?

"revoke extents from one client in order to transfer the extents to
another client".  This language is a cut-and-paste from 5563.  (And not
entirely accurate--there's othe reasons to recall a layout.)  If only
for attribution reason, make that an explicit quote.

But actually I would consider just cutting the first couple sentences
and say:

	[RFC 5663] suggests fencing clients using either LUN masking or
	cooperative clients....

"common notation of a client" doesn't sound right.  How about just
"common address for a client"?  Or "common name for a client"?

"To allow fencing individual systems, each system MUST use a unique
Persistent Reservation key."  I think that MUST should be lower case.
You're describing a requirement on the protocol not really a requirement
for any specific client or server behavior yet.

"After a successful GETDEVICEINFO operation the client MUST register the
registration key returned in sbv_pr_key with the storage device..." Must
it really?  Couldn't it change its mind and decide it doesn't want to
use that device or layout after all?  And in any case, if it skips the
registration, the IO will just fail, so I don't know that we really need
to make an explicit requirement here.  As a description of typical use
it's still helpful, so maybe word it something like: "Before performing
the first IO to a device returned from a GETDEVICEINFO operation the
client will register the registration key...".

But if I understand right the MUST for unregistering is important as
otherwise the registration just sits around until there's a conflict and
the server may have to unnecessarily delay conflicting IO, etc.

I assume this applies only if the client previously registered the key.
So maybe: "When a client stops using a device it previously registered,
it MUST..."  Also: "stops using" is a little vague.

"A client that detects I/O errors".... Is that any I/O error or just I/O
errors due to fencing?

"all outstanding layouts": why not just affected layouts?

"forget any open devices": again, what's scope of "any"?  What does
"forget" mean?  (Should it try to unregister?)

"by a future LAYOUTGET": should that be: by a future GETDEVICEINFO?  Or
by a future GETDEVICEINFO for this device?  Do we also need to instruct
the MDS to generate a new reservation key?

--b.

diff --git a/scsi_middle_introduction.xml b/scsi_middle_introduction.xml
index 09ff224ef731..3b8d5d376f66 100644
--- a/scsi_middle_introduction.xml
+++ b/scsi_middle_introduction.xml
@@ -24,7 +24,7 @@
       structures for the LAYOUTGET and LAYOUTCOMMIT operations.
 
       This document does not directly interact with <xref target='RFC6688' />,
-      although the mechanisms described in this document also archive the
+      although the mechanisms described in this document also achieve the
       goals of <xref target='RFC6688' />, and do so in a more robust fashion
       that does not depend on the cooperation of the systems involved. Thus,
       the mechanisms specified in <xref target='RFC6688' /> are not necessary
diff --git a/scsi_middle_layout.xml b/scsi_middle_layout.xml
index 8501f231762a..330d73160480 100644
--- a/scsi_middle_layout.xml
+++ b/scsi_middle_layout.xml
@@ -117,8 +117,7 @@
 	field does not need to be set separately.  Only certain combinations
 	of "sbv_code_set" and "sbv_designator_type" are valid, please refer to
 	<xref target="SPC3" /> for details, and note that ASCII may be used
-	as the code set for UTF-8 text that does not contain non-ASCII
-	characters.
+	as the code set for UTF-8 text that contains only ASCII characters.
 
 	Note that a Device Identification VPD page MAY contain multiple
 	descriptors with the same association, code set and designator type.
@@ -191,7 +190,7 @@
 
       <t>
         All rules for ordering and formation of a "pnfs_scsi_deviceaddr4"
-	structure are identical for those of a "pnfs_block_deviceaddr4"
+	structure are identical to those for a "pnfs_block_deviceaddr4"
 	structure in <xref target='RFC5663' />, except that the new
 	pnfs_scsi_base_volume_info4 PNFS_SCSI_VOLUME_BASE case is used in
 	place of the pnfs_block_simple_volume_info4 PNFS_BLOCK_VOLUME_SIMPLE
@@ -227,21 +226,21 @@
 	of type "Exclusive Access – All Registrants" on each SCSI logical unit
 	exported to pNFS clients the MDS prevents access from any client that
 	does not have an outstanding device device ID that gives the client
-	a reservation key to access to logical unit, and allows the MDS to
+	a reservation key to access the logical unit, and allows the MDS to
 	revoke access to the logic unit at any time.
       </t>
 
       <section anchor='ssc:fencing:keys'
 		title='PRs - Key Generation'>
       <t>
-	To allow fencing individual systems, each systems MUST use a unique
+	To allow fencing individual systems, each system MUST use a unique
 	Persistent Reservation key.  <xref target="SPC3" /> does not specify
 	a way to generate keys.  This document assigns the burden to generate
 	unique keys to the MDS, which MUST generate a key for itself before
-	exporting a volume, and one for each client that access a volume.
+	exporting a volume, and one for each client that accesses a volume.
 	The MDS MAY either generate a key for each client that accesses
-	logic units exported by the MDS, or to generate a key for each
-	[logical unit, client] combination.  In case of a single key per
+	logic units exported by the MDS, or generate a key for each
+	[logical unit, client] combination.  If using a single key per
 	client, the MDS needs to be aware of the per-client fencing
 	granularity.
       </t>
@@ -257,7 +256,7 @@
 	action of "REGISTER", followed by a "PERSISTENT RESERVE OUT" command,
 	with a service action of "RESERVE" and the type field set to 8h
 	(Exclusive Access – All Registrants).
-	To make sure all I_T nexus are registered, the MDS SHOULD set the
+	To make sure all I_T nexuses are registered, the MDS SHOULD set the
 	"All Target Ports" (ALL_TG_PT) bit when registering the key, or
 	otherwise ensure the registration is performed for each initiator port.
       </t>
@@ -290,16 +289,16 @@
 	  In case of a non-responding client the MDS MUST fence the client
 	  by issuing a "PERSISTENT RESERVE OUT" command with the service
 	  action set to "PREEMPT" or "PREEMPT AND ABORT", the reservation key
-	  field set to the server's reservation key, and the service action
+	  field set to the server's reservation key, the service action
 	  reservation key field set to the reservation key associated with
 	  the non-responding client, and the type field set to 8h (Exclusive
 	  Access – All Registrants).
         </t>
 
 	<t>
-	  After the MDS preempted a client, all client I/O to the logical unit
-	  will fail.  The client should at this point return any layout that
-	  refers to the device ID that poіnts to the logical unit.  Note that
+	  After the MDS preempts a client, all client I/O to the logical unit
+	  fails.  The client should at this point return any layout that
+	  refers to the device ID that points to the logical unit.  Note that
 	  the client can distinguish I/O errors due to fencing from other
 	  errors based on the "RESERVATION CONFLICT" status.  Refer to
 	  <xref target="SPC3" /> for details.