Re: [Gen-art] GenART review of draft-ietf-storm-iscsi-sam-06

Martin Thomson <martin.thomson@gmail.com> Mon, 13 August 2012 23:12 UTC

Return-Path: <martin.thomson@gmail.com>
X-Original-To: gen-art@ietfa.amsl.com
Delivered-To: gen-art@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 194B021F867B for <gen-art@ietfa.amsl.com>; Mon, 13 Aug 2012 16:12:19 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -3.984
X-Spam-Level:
X-Spam-Status: No, score=-3.984 tagged_above=-999 required=5 tests=[AWL=-0.385, BAYES_00=-2.599, RCVD_IN_DNSWL_LOW=-1]
Received: from mail.ietf.org ([12.22.58.30]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id C67unGKul3TW for <gen-art@ietfa.amsl.com>; Mon, 13 Aug 2012 16:12:17 -0700 (PDT)
Received: from mail-lpp01m010-f44.google.com (mail-lpp01m010-f44.google.com [209.85.215.44]) by ietfa.amsl.com (Postfix) with ESMTP id 600F721F8499 for <gen-art@ietf.org>; Mon, 13 Aug 2012 16:12:17 -0700 (PDT)
Received: by lahm15 with SMTP id m15so2411202lah.31 for <gen-art@ietf.org>; Mon, 13 Aug 2012 16:12:16 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=mime-version:in-reply-to:references:date:message-id:subject:from:to :cc:content-type; bh=nXwEQYvg243ak0rJsldHqgkW1btkrfnFrgGTEErGEIo=; b=fAxABXUPVYHsusVks93s2JVsQjnGOa4z6lzPilT7vS80KITsGiR6t8PTFgwJh6gI3Q 3PX2e/zIT76VIJii8eAscfKcROSAtdEGS4rJ2hiE0eHvhA/4EpKYqXRNQ/TgWt7iX9eI ooe86/uUhtTycrzNLUGouL15szZam34ONzsJnYsKPsx0pF4mybqwK7Bq6Aor37q/+BEV dfz20WM7aZbN7IoGnt9a/9/79S6MEWWRaV0CApkX++eT9mRWprfPQeXQlTICna7JtBSr fYk8CJks8EckNH1VXMXqBTu1i+YihPbPpUgVJgdptGLjV+Bv9lWm9l5XQ2mYDjJa4MUS owTQ==
MIME-Version: 1.0
Received: by 10.152.146.169 with SMTP id td9mr10169764lab.42.1344899536201; Mon, 13 Aug 2012 16:12:16 -0700 (PDT)
Received: by 10.112.41.193 with HTTP; Mon, 13 Aug 2012 16:12:16 -0700 (PDT)
In-Reply-To: <8D3D17ACE214DC429325B2B98F3AE71208F127D3@MX15A.corp.emc.com>
References: <CABkgnnXUW4Lq3uyjd_4-z=01ED3=S9Q1Ui2NP-=LVA7sQTrMDw@mail.gmail.com> <8D3D17ACE214DC429325B2B98F3AE71208F127D3@MX15A.corp.emc.com>
Date: Mon, 13 Aug 2012 16:12:16 -0700
Message-ID: <CABkgnnUn72e1tExLH9szf-fGaXeQuL4bgp8P3uBodyw+-P2aJw@mail.gmail.com>
From: Martin Thomson <martin.thomson@gmail.com>
To: "Black, David" <david.black@emc.com>
Content-Type: text/plain; charset="UTF-8"
Cc: "draft-ietf-storm-iscsi-sam.all@tools.ietf.org" <draft-ietf-storm-iscsi-sam.all@tools.ietf.org>, "gen-art@ietf.org" <gen-art@ietf.org>, Tom Talpey <ttalpey@microsoft.com>, Mallikarjun Chadalapaka <cbm@chadalapaka.com>
Subject: Re: [Gen-art] GenART review of draft-ietf-storm-iscsi-sam-06
X-BeenThere: gen-art@ietf.org
X-Mailman-Version: 2.1.12
Precedence: list
List-Id: "GEN-ART: General Area Review Team" <gen-art.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/gen-art>, <mailto:gen-art-request@ietf.org?subject=unsubscribe>
List-Archive: <http://www.ietf.org/mail-archive/web/gen-art>
List-Post: <mailto:gen-art@ietf.org>
List-Help: <mailto:gen-art-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/gen-art>, <mailto:gen-art-request@ietf.org?subject=subscribe>
X-List-Received-Date: Mon, 13 Aug 2012 23:12:19 -0000

David,

I think that you successfully covered most of the concerns.  I fully
expected there to be a number of decisions that would be immutable for
that reason.  I'll look for the update.

--Martin

On 13 August 2012 14:32, Black, David <david.black@emc.com> wrote:
> Martin,
>
> Thank you very much for this review - you found a number of things
> that need attention, but there are also a few comments that
> (probably unbeknownst to you) represent design decisions that
> cannot be revisited courtesy of the amount of deployed iSCSI
> "running code".
>
> Responses are inline below.
>
> Thanks,
> --David
>
>> -----Original Message-----
>> From: gen-art-bounces@ietf.org [mailto:gen-art-bounces@ietf.org] On Behalf Of
>> Martin Thomson
>> Sent: Tuesday, July 24, 2012 6:54 PM
>> To: draft-ietf-storm-iscsi-sam.all@tools.ietf.org; gen-art@ietf.org
>> Subject: [Gen-art] GenART review of draft-ietf-storm-iscsi-sam-06
>>
>> I have been selected as the General Area Review Team (Gen-ART)
>> reviewer for this draft (for background on Gen-ART, please see
>> http://www.alvestrand.no/ietf/gen/art/gen-art-FAQ.html).
>>
>> Please wait for direction from your document shepherd
>> or AD before posting a new version of the draft.
>>
>> Document: draft-ietf-storm-iscsi-sam-06
>> Reviewer: Martin Thomson
>> Review Date: 2012-07-24
>> IESG Telechat date: ?
>>
>> Summary: This document has a few issues that would need to be
>> addressed before it is published as a Standards Track RFC.
>>
>> Major Issues: None
>>
>> Minor Issues:
>>
>> Is the intent to reference RFC 3720 or the new -cons draft (as
>> published)?  Given that -cons is intended to obsolete 3720, I would
>> have expected no reference to 3720 at all.
>
> That's a correct expectation, and results from some confusion over what the
> final status of RFC 3720 would be.  As RFC 3720 is being obsoleted by the
> -cons draft, RFC3720 cannot be normatively referenced by either the -sam or
> -cons drafts, and hence most of the citations in the body of the -sam draft
> need to be to the -cons draft.
>
>> Section 5.1 / 5.1.1 does not define how many bits the PRI field
>> consumes.  I might infer from the diagram that is the four most
>> significant bits of the second byte, but the diagram is unclear.  In
>> any case, the diagram should only be used for illustrative purposes
>> and the text should provide the definitive answer.
>>
>> Section 5.2.1 also relies on the diagram.  Though in this case it is
>> clearer and I can easily infer that the status qualifier is one byte,
>> I'm still relying on the diagram.
>
> These two are minor, but I have no problem with changing them as suggested.
>
>> Why is Section 6.2.1 not Section 5.1.2?  Same for 6.2.2 and 6.2.3.
>> Aren't these new (?) additions to the Command PDU?  Or is there
>> another PDU that these apply to?
>
> There's another PDU that these apply to, the Task Management Function
> Request PD - its layout diagram needs to be copied into the -sam
> draft from section 11.5 of the -cons draft so that the -sam draft is
> somewhat more self-contained.
>
>> Section 7.1.1 What does it mean to not claim support for any
>> particular level?  Obviously you can't use the features described
>> here, but what else?
>
> It just means that the system is running "iSCSI" without providing
> any additional details.  The ability to announce support without
> claiming support for any particular level is a SCSI feature that is
> applied to all SCSI-related standards.  The other party in the
> communication needs to make least common denominator assumptions
> about supported functionality, subject to further checks that may
> provide additional information (e.g., iSCSI text key negotiation,
> read a particular SCSI VPD page).
>
>> Section 8 makes a bold claim that needs substantiation.  Given the
>> global nature of task identifiers, is it necessary to prevent one user
>> from querying a task that another created?
>
> No it is not - to the extent that iSCSI has a notion of user, that
> notion is session-wide across all connections.
>
>> What about triggering reset on a I_T nexus that another user is using?
>
> The I_T NEXUS RESET function in the -sam draft can't do that because that
> function is reflexive - it resets the I_T nexus on which the function is
> sent.  Yes, this really is a SCSI version of "shoot yourself in the foot"
> :-) and it is useful because target systems generally respond by cleaning
> out all of the state related to the nexus which can clean up some situations.
>
> Nonetheless, there are already SCSI-defined means in iSCSI to take
> out other I_T nexuses, and the security measures for those (primarily
> configuration-based access control) are handled at the SCSI level.
>
> Something else to keep in mind is that the usual notion of "user" as a person
> does not apply to SCSI and iSCSI.  The initiator is typically an operating
> system driver or iSCSI offload HBA, and hence the identity is a system
> identity as opposed to a user identity.
>
>> Questions (for -cons):
>>
>> Section 6.2 makes the following requirement:
>>    If the connection is still active (it is not undergoing an
>>    implicit or explicit logout), QUERY TASK MUST be issued on the
>>    same connection to which the task to be queried is allegiant at
>>    the time the Task Management request is issued. If the
>>    connection is implicitly or explicitly logged out (i.e., no other
>>    request will be issued on the failing connection and no other
>>    response will be received on the failing connection), then a
>>    QUERY TASK function request may be issued on another connection.
>>    This Task Management request will then establish a new allegiance
>>    for the command being queried.
>>
>> The comment is probably more for of -cons (Section 4.2.5.1 and Section
>> 7.2.2).  Obviously, this design is long-standing, and I'm only reading
>> parts of the specification.
>
> And there's a lot of "running code" that uses this design.
>
>> There is an implication that tasks have identification that is
>> globally unique, even if the normal behaviour is to bind (ally) a
>> request with a connection.  The reasons for allowing this are not
>> explained, but it imposes some fragility on the system.  For instance,
>> the requirement that the old connection be closed with a "remove the
>> connection for recovery" seems counter to the goal of failure
>> recovery.  If the point of this is for failure recovery, then a
>> primary failure mode would be network failure - in which case this
>> mechanism cannot be used.
>
> There are multiple notions of "global" involved here.  Task identity is
> global across the session, but not across independent sessions.  SCSI
> multipathing (e.g., as used with SANs) involves independent iSCSI sessions
> across which there is no shared iSCSI state.  The primary motivation for
> this iSCSI failure recovery mechanism is partial network failure, e.g., the
> two TCP connections are assigned to two different server NICs on the same
> network, and one of those NICs fails.  Complete network failure is not
> a motivation, as that would be handled at a higher level, e.g., by
> SAN-oriented SCSI multipathing across iSCSI sessions on different networks,
> and obviously, there's not much that can be done if all network connectivity
> is lost between a host and its storage.
>
>> What purpose does this allegiance feature address?
>
> Isolates connection failure effects to I/Os that use the connection as
> opposed to making them global.  It's not ideal, in that the CmdSN window
> will likely run out by the time that the failure is detected and recovered
> from, but at least failure of one connection doesn't stop all I/Os in
> their tracks.  This also avoids a functionality shift when moving from
> one-TCP-connection sessions (which are common) to multiple-TCP-connection
> sessions.
>
>> What are the security implications of using allegiance?
>
> Not much that I can immediately think of.  For any individual I/O, the
> state and data are on a single connection, but security considerations
> generally do not enter into connection selection - load balancing tends
> to be the primary consideration.
>
>> Nits:
>>
>> I don't know what tooling was used to generate this document, but
>> there are some strange artifacts.  In particular, diagrams and tables
>> are misaligned in a number of places.
>
> Guilty as charged (sorry), and there's actually a mistake in the mapping
> table courtesy of another tool glitch.  A more careful check will be done
> on the next version before it's submitted.  This is partly my fault, as
> I leaned on the editor to get this version submitted with a text correction,
> not realizing that the tooling would take advantage of the resulting haste
> to make waste ...
>
>> There are a few terms like "I_T nexus" that are not explained prior to
>> use.  These are defined in -cons.  However, I find the existence of a
>> terminology mapping table in this draft to be strange.  Wouldn't that
>> table be more useful in the "core" document?
>
> Good suggestion, I've cc:'d the author of the -cons draft, and we'll
> figure this out in consultation with the WG.
>
>>
>> Section 4.1:
>>
>> This seems unnecessarily convoluted:
>>
>>       Note that an operational value of "2" or higher for this key on
>>       an iSCSI session does not influence the SCSI level features in
>>       any way on that I_T nexus. An operational value of "2" or higher
>>       for this key permits the iSCSI-related features defined in this
>>       document to be used on all connections on this iSCSI session.
>>       SCSI level hand-shakes (e.g. commands, mode pages) eventually
>>       determine the existence or lack of various SAM-5 features
>>       available for the I_T nexus between the two SCSI end points). To
>>       summarize, negotiation of this key to "2" or higher is a
>>       necessary but not a sufficient condition of SAM-5 compliant
>>       feature usage at the SCSI protocol level.
>>
>> How about:
>>
>>   In addition to negotiating a value of "2" or higher, support for an individual
>>   MUST also be signaled using SCSI level hand-shakes prior to use.
>>
>> I know that adds 2119 language, but if the goal is to prevent someone
>> from attempting to use a feature prior to getting positive
>> confirmation of support, then this should be ok.
>
> I think the addition of a "MUST" here is fine, and we'll see about crafting some
> language that better reflects the SCSI aspects, as "hand-shakes" is actually
> not a common SCSI term.
>
>> Section 6.2 describes multiple tasks.  It would be good if the
>> description of each of the tasks were given a sub-section so that they
>> could be cross referenced and more readily found.
>
> Unfortunately, RFC 3720 and the consolidated draft were not written that way,
> and we'd prefer to keep that structure.
>
>>
>> --Martin
>> _______________________________________________
>> Gen-art mailing list
>> Gen-art@ietf.org
>> https://www.ietf.org/mailman/listinfo/gen-art
>