Re: [Gen-art] GenART review of draft-ietf-storm-iscsi-sam-06

"Black, David" <david.black@emc.com> Mon, 13 August 2012 21:33 UTC

Return-Path: <david.black@emc.com>
X-Original-To: gen-art@ietfa.amsl.com
Delivered-To: gen-art@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 0030A21F84E4 for <gen-art@ietfa.amsl.com>; Mon, 13 Aug 2012 14:33:31 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -102.486
X-Spam-Level:
X-Spam-Status: No, score=-102.486 tagged_above=-999 required=5 tests=[AWL=0.113, BAYES_00=-2.599, USER_IN_WHITELIST=-100]
Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id eFuJjIxIz6Q6 for <gen-art@ietfa.amsl.com>; Mon, 13 Aug 2012 14:33:30 -0700 (PDT)
Received: from mexforward.lss.emc.com (hop-nat-141.emc.com [168.159.213.141]) by ietfa.amsl.com (Postfix) with ESMTP id B16F121F853B for <gen-art@ietf.org>; Mon, 13 Aug 2012 14:33:29 -0700 (PDT)
Received: from hop04-l1d11-si01.isus.emc.com (HOP04-L1D11-SI01.isus.emc.com [10.254.111.54]) by mexforward.lss.emc.com (Switch-3.4.3/Switch-3.4.3) with ESMTP id q7DLXKhk005454 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-SHA bits=256 verify=NO); Mon, 13 Aug 2012 17:33:24 -0400
Received: from mailhub.lss.emc.com (mailhub.lss.emc.com [10.254.221.251]) by hop04-l1d11-si01.isus.emc.com (RSA Interceptor); Mon, 13 Aug 2012 17:33:00 -0400
Received: from mxhub15.corp.emc.com (mxhub15.corp.emc.com [128.222.70.236]) by mailhub.lss.emc.com (Switch-3.4.3/Switch-3.4.3) with ESMTP id q7DLWwk3028758; Mon, 13 Aug 2012 17:32:58 -0400
Received: from mx15a.corp.emc.com ([169.254.1.189]) by mxhub15.corp.emc.com ([128.222.70.236]) with mapi; Mon, 13 Aug 2012 17:32:58 -0400
From: "Black, David" <david.black@emc.com>
To: Martin Thomson <martin.thomson@gmail.com>, "draft-ietf-storm-iscsi-sam.all@tools.ietf.org" <draft-ietf-storm-iscsi-sam.all@tools.ietf.org>, "gen-art@ietf.org" <gen-art@ietf.org>
Date: Mon, 13 Aug 2012 17:32:56 -0400
Thread-Topic: [Gen-art] GenART review of draft-ietf-storm-iscsi-sam-06
Thread-Index: Ac1p7ztS7WE09KUwRUae6j5CKR0etQPp56Pw
Message-ID: <8D3D17ACE214DC429325B2B98F3AE71208F127D3@MX15A.corp.emc.com>
References: <CABkgnnXUW4Lq3uyjd_4-z=01ED3=S9Q1Ui2NP-=LVA7sQTrMDw@mail.gmail.com>
In-Reply-To: <CABkgnnXUW4Lq3uyjd_4-z=01ED3=S9Q1Ui2NP-=LVA7sQTrMDw@mail.gmail.com>
Accept-Language: en-US
Content-Language: en-US
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
acceptlanguage: en-US
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable
MIME-Version: 1.0
X-EMM-MHVC: 1
Cc: Mallikarjun Chadalapaka <cbm@chadalapaka.com>, Tom Talpey <ttalpey@microsoft.com>
Subject: Re: [Gen-art] GenART review of draft-ietf-storm-iscsi-sam-06
X-BeenThere: gen-art@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: "GEN-ART: General Area Review Team" <gen-art.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/gen-art>, <mailto:gen-art-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/gen-art>
List-Post: <mailto:gen-art@ietf.org>
List-Help: <mailto:gen-art-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/gen-art>, <mailto:gen-art-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 13 Aug 2012 21:33:32 -0000

Martin,

Thank you very much for this review - you found a number of things
that need attention, but there are also a few comments that
(probably unbeknownst to you) represent design decisions that
cannot be revisited courtesy of the amount of deployed iSCSI
"running code".

Responses are inline below.

Thanks,
--David

> -----Original Message-----
> From: gen-art-bounces@ietf.org [mailto:gen-art-bounces@ietf.org] On Behalf Of
> Martin Thomson
> Sent: Tuesday, July 24, 2012 6:54 PM
> To: draft-ietf-storm-iscsi-sam.all@tools.ietf.org; gen-art@ietf.org
> Subject: [Gen-art] GenART review of draft-ietf-storm-iscsi-sam-06
> 
> I have been selected as the General Area Review Team (Gen-ART)
> reviewer for this draft (for background on Gen-ART, please see
> http://www.alvestrand.no/ietf/gen/art/gen-art-FAQ.html).
> 
> Please wait for direction from your document shepherd
> or AD before posting a new version of the draft.
> 
> Document: draft-ietf-storm-iscsi-sam-06
> Reviewer: Martin Thomson
> Review Date: 2012-07-24
> IESG Telechat date: ?
> 
> Summary: This document has a few issues that would need to be
> addressed before it is published as a Standards Track RFC.
> 
> Major Issues: None
> 
> Minor Issues:
> 
> Is the intent to reference RFC 3720 or the new -cons draft (as
> published)?  Given that -cons is intended to obsolete 3720, I would
> have expected no reference to 3720 at all.

That's a correct expectation, and results from some confusion over what the
final status of RFC 3720 would be.  As RFC 3720 is being obsoleted by the
-cons draft, RFC3720 cannot be normatively referenced by either the -sam or
-cons drafts, and hence most of the citations in the body of the -sam draft
need to be to the -cons draft.

> Section 5.1 / 5.1.1 does not define how many bits the PRI field
> consumes.  I might infer from the diagram that is the four most
> significant bits of the second byte, but the diagram is unclear.  In
> any case, the diagram should only be used for illustrative purposes
> and the text should provide the definitive answer.
> 
> Section 5.2.1 also relies on the diagram.  Though in this case it is
> clearer and I can easily infer that the status qualifier is one byte,
> I'm still relying on the diagram.

These two are minor, but I have no problem with changing them as suggested.

> Why is Section 6.2.1 not Section 5.1.2?  Same for 6.2.2 and 6.2.3.
> Aren't these new (?) additions to the Command PDU?  Or is there
> another PDU that these apply to?

There's another PDU that these apply to, the Task Management Function
Request PD - its layout diagram needs to be copied into the -sam
draft from section 11.5 of the -cons draft so that the -sam draft is
somewhat more self-contained.
 
> Section 7.1.1 What does it mean to not claim support for any
> particular level?  Obviously you can't use the features described
> here, but what else?

It just means that the system is running "iSCSI" without providing
any additional details.  The ability to announce support without
claiming support for any particular level is a SCSI feature that is
applied to all SCSI-related standards.  The other party in the
communication needs to make least common denominator assumptions
about supported functionality, subject to further checks that may
provide additional information (e.g., iSCSI text key negotiation,
read a particular SCSI VPD page).

> Section 8 makes a bold claim that needs substantiation.  Given the
> global nature of task identifiers, is it necessary to prevent one user
> from querying a task that another created?

No it is not - to the extent that iSCSI has a notion of user, that
notion is session-wide across all connections.

> What about triggering reset on a I_T nexus that another user is using?

The I_T NEXUS RESET function in the -sam draft can't do that because that
function is reflexive - it resets the I_T nexus on which the function is
sent.  Yes, this really is a SCSI version of "shoot yourself in the foot"
:-) and it is useful because target systems generally respond by cleaning
out all of the state related to the nexus which can clean up some situations.

Nonetheless, there are already SCSI-defined means in iSCSI to take
out other I_T nexuses, and the security measures for those (primarily
configuration-based access control) are handled at the SCSI level.

Something else to keep in mind is that the usual notion of "user" as a person
does not apply to SCSI and iSCSI.  The initiator is typically an operating
system driver or iSCSI offload HBA, and hence the identity is a system
identity as opposed to a user identity.
 
> Questions (for -cons):
> 
> Section 6.2 makes the following requirement:
>    If the connection is still active (it is not undergoing an
>    implicit or explicit logout), QUERY TASK MUST be issued on the
>    same connection to which the task to be queried is allegiant at
>    the time the Task Management request is issued. If the
>    connection is implicitly or explicitly logged out (i.e., no other
>    request will be issued on the failing connection and no other
>    response will be received on the failing connection), then a
>    QUERY TASK function request may be issued on another connection.
>    This Task Management request will then establish a new allegiance
>    for the command being queried.
> 
> The comment is probably more for of -cons (Section 4.2.5.1 and Section
> 7.2.2).  Obviously, this design is long-standing, and I'm only reading
> parts of the specification.

And there's a lot of "running code" that uses this design.
 
> There is an implication that tasks have identification that is
> globally unique, even if the normal behaviour is to bind (ally) a
> request with a connection.  The reasons for allowing this are not
> explained, but it imposes some fragility on the system.  For instance,
> the requirement that the old connection be closed with a "remove the
> connection for recovery" seems counter to the goal of failure
> recovery.  If the point of this is for failure recovery, then a
> primary failure mode would be network failure - in which case this
> mechanism cannot be used.

There are multiple notions of "global" involved here.  Task identity is
global across the session, but not across independent sessions.  SCSI
multipathing (e.g., as used with SANs) involves independent iSCSI sessions
across which there is no shared iSCSI state.  The primary motivation for
this iSCSI failure recovery mechanism is partial network failure, e.g., the
two TCP connections are assigned to two different server NICs on the same
network, and one of those NICs fails.  Complete network failure is not 
a motivation, as that would be handled at a higher level, e.g., by
SAN-oriented SCSI multipathing across iSCSI sessions on different networks,
and obviously, there's not much that can be done if all network connectivity
is lost between a host and its storage.
 
> What purpose does this allegiance feature address?

Isolates connection failure effects to I/Os that use the connection as
opposed to making them global.  It's not ideal, in that the CmdSN window
will likely run out by the time that the failure is detected and recovered
from, but at least failure of one connection doesn't stop all I/Os in
their tracks.  This also avoids a functionality shift when moving from
one-TCP-connection sessions (which are common) to multiple-TCP-connection
sessions.
 
> What are the security implications of using allegiance?

Not much that I can immediately think of.  For any individual I/O, the
state and data are on a single connection, but security considerations
generally do not enter into connection selection - load balancing tends
to be the primary consideration.
 
> Nits:
> 
> I don't know what tooling was used to generate this document, but
> there are some strange artifacts.  In particular, diagrams and tables
> are misaligned in a number of places.

Guilty as charged (sorry), and there's actually a mistake in the mapping
table courtesy of another tool glitch.  A more careful check will be done
on the next version before it's submitted.  This is partly my fault, as
I leaned on the editor to get this version submitted with a text correction,
not realizing that the tooling would take advantage of the resulting haste
to make waste ...
 
> There are a few terms like "I_T nexus" that are not explained prior to
> use.  These are defined in -cons.  However, I find the existence of a
> terminology mapping table in this draft to be strange.  Wouldn't that
> table be more useful in the "core" document?

Good suggestion, I've cc:'d the author of the -cons draft, and we'll
figure this out in consultation with the WG.

> 
> Section 4.1:
> 
> This seems unnecessarily convoluted:
> 
>       Note that an operational value of "2" or higher for this key on
>       an iSCSI session does not influence the SCSI level features in
>       any way on that I_T nexus. An operational value of "2" or higher
>       for this key permits the iSCSI-related features defined in this
>       document to be used on all connections on this iSCSI session.
>       SCSI level hand-shakes (e.g. commands, mode pages) eventually
>       determine the existence or lack of various SAM-5 features
>       available for the I_T nexus between the two SCSI end points). To
>       summarize, negotiation of this key to "2" or higher is a
>       necessary but not a sufficient condition of SAM-5 compliant
>       feature usage at the SCSI protocol level.
> 
> How about:
> 
>   In addition to negotiating a value of "2" or higher, support for an individual
>   MUST also be signaled using SCSI level hand-shakes prior to use.
> 
> I know that adds 2119 language, but if the goal is to prevent someone
> from attempting to use a feature prior to getting positive
> confirmation of support, then this should be ok.

I think the addition of a "MUST" here is fine, and we'll see about crafting some
language that better reflects the SCSI aspects, as "hand-shakes" is actually
not a common SCSI term.

> Section 6.2 describes multiple tasks.  It would be good if the
> description of each of the tasks were given a sub-section so that they
> could be cross referenced and more readily found.

Unfortunately, RFC 3720 and the consolidated draft were not written that way,
and we'd prefer to keep that structure.

> 
> --Martin
> _______________________________________________
> Gen-art mailing list
> Gen-art@ietf.org
> https://www.ietf.org/mailman/listinfo/gen-art