Re: [storm] SPC-2 reserve/release: Proposed text - version 4

"Black, David" <david.black@emc.com> Wed, 26 September 2012 14:51 UTC

From: "Black, David" <david.black@emc.com>
To: "Knight, Frederick" <Frederick.Knight@netapp.com>, "storm@ietf.org" <storm@ietf.org>
Date: Wed, 26 Sep 2012 10:51:04 -0400

Fred,

> I don't see anywhere that these timers are associated with any SCSI state, or
> with the RESERVE state.  Attempting to tie RESERVE state to these timers is a
> NEW requirement.  Our goal here was specifically to NOT create any new
> requirements.

I think you have a point that this is too specific.  What if we just delete
mention of those timers?  The bullet in question could then simply say:

  - When connection failure causes the iSCSI session to fail, and
      the session is not reinstated or continued, target retention
      of that session's reserve/release reservation state for an
      extended period of time may require the initiator to issue a
      reset (e.g., LOGICAL UNIT RESET, see section 11.5) in order
      to remove that reservation state.

That captures the primary concern that I wanted to alert implementers to.
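
To make the scenario in that bullet concrete, here is a minimal toy model, not taken from any draft or implementation. The `LogicalUnit` class, its method names, and the tuple used to represent an I_T nexus are all hypothetical; the sketch only shows why a retained reservation can force a reset.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Hypothetical stand-in for an I_T nexus: (initiator port, target port).
Nexus = Tuple[str, str]


@dataclass
class LogicalUnit:
    """Toy logical unit holding at most one reserve/release reservation."""

    holder: Optional[Nexus] = None

    def reserve(self, nexus: Nexus) -> bool:
        # RESERVE succeeds if the unit is free or already held by this nexus.
        if self.holder is None or self.holder == nexus:
            self.holder = nexus
            return True
        return False  # reservation conflict

    def session_failed(self, nexus: Nexus, target_retains_state: bool) -> None:
        # Assumption: the target may or may not keep the reservation when
        # the session fails and is never reinstated or continued.
        if self.holder == nexus and not target_retains_state:
            self.holder = None

    def logical_unit_reset(self) -> None:
        # A LOGICAL UNIT RESET removes the reserve/release reservation.
        self.holder = None


lu = LogicalUnit()
host_a = ("initiator-a", "target-port-1")
host_b = ("initiator-b", "target-port-1")

lu.reserve(host_a)
lu.session_failed(host_a, target_retains_state=True)
blocked = not lu.reserve(host_b)   # stale reservation still held for host_a
lu.logical_unit_reset()
recovered = lu.reserve(host_b)     # reset cleared it, host_b can now reserve
```

In this model the only way for `host_b` to make progress after the failed session is the reset, which is the concern the bullet is meant to flag.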

> We do not want to impact existing iSCSI implementations.  Could we associate
> the action that occurred with the observable behavior the host sees?  I don't
> like putting SCSI layer stuff in a transport layer spec, but it should just be
> informative.

The text that follows explicitly lists the SCSI ASC/Q values (e.g., 29/07, in
hex).  I'd prefer to avoid that, because T10 could easily add further ASC/Q
values that are relevant to this situation.

As an alternative, I suggest that we state the principle and point to SAM-4
for the details, e.g., add the following text to the end of 4.4.3.1:

   The specific Unit Attention condition that results from session
   re-establishment indicates whether or not nexus state was preserved,
   see Section 6.3.4 of [SAM4].

Would that be ok?
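
For illustration only, the table-driven alternative from the quoted message below could be sketched as follows. The function and enum names are hypothetical, and the table contains just the four codes listed in that message; it is not a complete or authoritative list, which is exactly the maintenance concern with hard-coding such values.

```python
from enum import Enum


class ReserveState(Enum):
    LOST = "lost"
    PRESERVED = "preserved"
    UNKNOWN = "unknown"


# (ASC, ASCQ) pairs in hex, exactly as listed in the quoted message.
# Codes not listed there (e.g., 29/02, 29/03, 29/05, 29/06) are left out.
_UA_TO_RESERVE_STATE = {
    (0x29, 0x00): ReserveState.LOST,
    (0x29, 0x01): ReserveState.LOST,
    (0x29, 0x04): ReserveState.LOST,  # might instead belong under UNKNOWN
    (0x29, 0x07): ReserveState.PRESERVED,
}


def reserve_state_after_ua(asc: int, ascq: int) -> ReserveState:
    """Classify a Unit Attention; anything not in the table is UNKNOWN."""
    return _UA_TO_RESERVE_STATE.get((asc, ascq), ReserveState.UNKNOWN)
```

Any code that T10 adds later would fall into `UNKNOWN` until the table is updated, which is why pointing to SAM-4 for the details seems safer than listing values in the iSCSI text.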

Thanks,
--David

> -----Original Message-----
> From: Knight, Frederick [mailto:Frederick.Knight@netapp.com]
> Sent: Wednesday, September 26, 2012 8:39 AM
> To: Mallikarjun Chadalapaka; Black, David; storm@ietf.org
> Subject: RE: SPC-2 reserve/release: Proposed text - version 4
>
> I'm concerned about this new link between Time2Wait and RESERVE state.  Such a
> link has not previously existed.  I don't think we should create such a
> linkage.
>
> Section 4.6.3.3 includes:
> .... In addition, the Logout Response indicates how long the target will
> continue to hold resources for recovery (e.g., command execution that
> continues on a new connection)....
>
> In this and the many other places where Time2Wait and Time2Retain are
> discussed, the points all concern command continuation; there is no mention
> of I_T state, RESERVE state, mode pages, or anything else.
>
> Section 7.5 says:
> .... Time2Wait is the initial "respite time" before attempting an
> explicit/implicit Logout for the CID in question or task reassignment for the
> affected tasks (if any)....
>
> That is, task reassignment for the affected tasks, with no mention of I_T
> state.
>
> Section 7.6 ties the termination of tasks to the Time2Wait / Time2Retain
> timers.  Section 11.14.5 makes the same statements about the task termination
> linkage to those timers.
>
> Section 11.15.3 says those timers are "...the minimum amount of time, in
> seconds, to wait before attempting task reassignment.".
>
> Section 8.3 defines Session State.  And there is one sentence that I can find
> in all 345 pages that associates that session state with these timers.  It is
> in section 11.15.4 where it says: If it is the last connection of a session,
> the whole session state is discarded after Time2Retain.
>
> I don't see anywhere that these timers are associated with any SCSI state, or
> with the RESERVE state.  Attempting to tie RESERVE state to these timers is a
> NEW requirement.  Our goal here was specifically to NOT create any new
> requirements.
>
> We do not want to impact existing iSCSI implementations.  Could we associate
> the action that occurred with the observable behavior the host sees?  I don't
> like putting SCSI layer stuff in a transport layer spec, but it should just be
> informative.  Something like:
>
> If the initiator receives a UA: 29/00 or 29/01 or 29/04 then the RESERVE state
> has been lost.  (Maybe 29/04 belongs in a RESERVE state unknown list).
>
> If the initiator receives a UA: 29/07 then the RESERVE state has been
> preserved.
>
> I didn't list 29/02 or 29/03 or 29/05 or 29/06 since I view those as Parallel
> SCSI specific.  We can debate which items belong in which list, but would this
> kind of observable behavior approach be better than a new requirement?  FYI -
> I didn't invent this; I copied it from SAM-5r11 (sub-clause 6.3.4) where it
> states it as a requirement on the target -  that when an I_T NEXUS LOSS event
> occurs, if the state is retained, 29/07 is returned, and if the state is not
> retained, then 29/01 is returned.
>
> We also don't want to create any new requirement now in this consolidated
> draft that conflicts with this SAM-5 text, since we will eventually have to
> comply with SAM-5 and its follow-ons.
>
> Another approach would be to ignore this for the consolidated draft, and force
> it into the iSCSI-SAM draft - which references SAM-4 and SAM-5 where I_T Nexus
> loss is already defined and the above text already exists.
>
>       Fred
>
> -----Original Message-----
> From: storm-bounces@ietf.org [mailto:storm-bounces@ietf.org] On Behalf Of
> Mallikarjun Chadalapaka
> Sent: Tuesday, September 25, 2012 9:39 PM
> To: Black, David; storm@ietf.org
> Subject: Re: [storm] SPC-2 reserve/release: Proposed text - version 4
>
> Looks good to me overall. Two minor comments:
>
> 1) For 4.4.3.1, I have replaced the current sentence with the following one
> (instead of the proposed):
>
> That is, it should reinstate the session via iSCSI session reinstatement
> (Section 6.3.5) or continue via session continuation (Section 6.3.6).
>
> 2) For proposed new text, I have tweaked the version 4 sentence below, keeping
> the existing implementations in view:
>
> VERSION 4:
> The Time2Wait and Time2Retain
>       timeout values (see Section 7.5) apply to retention of reserve/release
>       reservation state after iSCSI session failure because that state
>       is part of the I_T Nexus state.
>
> VERSION 5:
> Instead, implementations are strongly encouraged to apply the
>       Time2Wait and Time2Retain timeout values (see Section 7.5)
>       to manage retention of reserve/release reservation state after
>       iSCSI session failure because that state is part of the I_T nexus state.
>
>
> Thanks.
>
> Mallikarjun
>
>
> -----Original Message-----
> From: storm-bounces@ietf.org [mailto:storm-bounces@ietf.org] On Behalf Of
> Black, David
> Sent: Tuesday, September 25, 2012 2:39 PM
> To: storm@ietf.org
> Subject: [storm] FW: SPC-2 reserve/release: Proposed text - version 4
>
> version 4 changes:
>
> More changes for Mallikarjun's comments, mostly a rewrite of the first
> additional caveat bullet.  I've chosen not to reference Annex B of
> SPC-4 as that seems a bit far afield for a SCSI transport standard like iSCSI.
>
> version 3 changes:
>
> Responding to Mallikarjun's comments (plus some more edits) -
>
> > 1) IMHO, opening sentence of "I_T nexus state includes reservation state"
> > is a little too broad,
>
> I removed that sentence, removed I_T Nexus from new section title and tweaked
> the persistent reservation text to talk about state in general as opposed to
> I_T Nexus state in particular.
>
> > 2) Hard reset is not an LU reset - it is either an iSCSI Target Warm
> > Reset, or (more likely) an iSCSI Target Cold Reset. In either case,
> > any of the three resets should clear the reserve/release state.
>
> I removed the word "hard" - as you noted, an LU reset is sufficient to remove
> a reserve/release reservation.
>
> > 3) Should use Time2Retain+Time2Wait, not Default Time2Wait
>
> Brilliant minds think alike - I got to that one in version 2 ;-).
>
> > 4) Session recovery does not necessarily preserve the I_T nexus state
> > (recovery may not occur within Time2Wait+Time2Retain, and may not use
> > the same ISID when it occurs), but session reinstatement and session
> > continuation certainly should.
>
> I changed "session recovery" to "session reinstatement".
>
> > 5) Should articulate the underlying linkage between Time2Retain (an
> > iSCSI concept), and the reserve/release state (a SCSI state).
>
> I believe I also got to that one in version 2.
>
> version 2 changes:
>
> I moved a paragraph break in response to Ralph's comments, and rewrote the
> first additional consideration to reference the connection timeout values in
> general.  I also found some minor problems that need correction in Section
> 4.6.3.3 (see end of this message).
>
> --- version 4 follows ---
>
> <WG chair hat off>
>
> Here's an initial attempt to propose text to deal with the reserve/release
> topic in the consolidated iSCSI draft.  Please comment, correct, suggest
> edits, etc.
>
> The new text would be a new section 4.4.3.2 that would follow 4.4.3.1:
>
> 4.4.3.1. I_T Nexus State
>
>   Certain nexus relationships contain an explicit state (e.g.,
>   initiator-specific mode pages) that may need to be preserved by
>   the device server [SAM2] in a logical unit through changes or
>   failures in the iSCSI layer (e.g., session failures). In order for
>   that state to be restored, the iSCSI initiator should reestablish
>   its session (re-login) to the same Target Portal Group using the
>   previous ISID. That is, it should perform session recovery as
>   described in Chapter 6. This is because the SCSI initiator port
>   identifier and the SCSI target port identifier (or relative target
>   port) form the datum that the SCSI logical unit device server uses
>   to identify the I_T nexus.
>
> -------------------------------
>
> First, for clarity, the following change should be made in the above 4.4.3.1
> text to better identify the recovery mechanism:
>
> OLD
>   That is, it should perform session recovery as
>   described in Chapter 6.
> NEW
>   That is, it should recover the session via connection
>   reinstatement as described in Section 6.3.4.
> END
>
> I believe that a lower-case "should" remains appropriate for this text.
>
> -----------------------------------
>
> NEW TEXT (second draft):
>
> 4.4.3.2. Reservations
>
>   There are two reservation management methods defined in the SCSI
>   standards: reserve/release reservations, based on the RESERVE and
>   RELEASE commands [SPC2], and persistent reservations, based on the
>   PERSISTENT RESERVE IN and PERSISTENT RESERVE OUT commands [SPC3].
>   Reserve/release reservations are obsolete [SPC3] and SHOULD NOT be
>   used; persistent reservations SHOULD be used instead.
>
>   State for persistent reservations is required to persist
>   through changes and failures at the iSCSI layer that result in
>   I_T Nexus failures, see [SPC3] for details and specific requirements.
>
>   In contrast, [SPC2] does not specify detailed persistence
>   requirements for reserve/release reservation state after an I_T
>   Nexus failure.  Nonetheless, when reserve/release reservations are
>   supported by an iSCSI target, the preferred implementation approach
>   is to preserve reserve/release reservation state as part of the
>   I_T Nexus state for iSCSI session reinstatement (see Section 6.3.5)
>   or session continuation (see Section 6.3.6).
>
>   Two additional caveats apply to reserve/release reservations:
>
>   - When connection failure causes the iSCSI session to fail, and
>       the session is not reinstated or continued, target retention
>       of that session's reserve/release reservation state for an
>       extended period of time may require the initiator to issue a
>       reset (e.g., LOGICAL UNIT RESET, see section 11.5) in order
>       to remove that reservation state.  The Time2Wait and Time2Retain
>       timeout values (see Section 7.5) apply to retention of reserve/release
>       reservation state after iSCSI session failure because that state
>       is part of the I_T Nexus state.  The need for resets in this
>       scenario may be reduced by suitable selection of values for
>       these two timeouts.
>
>   - Reserve/release reservations may not behave as expected when
>       persistent reservations are also used on the same logical unit;
>       see the discussion of "Exceptions to SPC-2 RESERVE and RELEASE
>       behavior" in [SPC4].
>
> -----------------------
>
> Reference impacts: both [SPC2] and [SPC3] need to be normative references, but
> [SPC4] can be an informative reference.
>
> -------------------------
>
> I deliberately used "the preferred iSCSI implementation approach"
> wording for reserve/release reservation state preservation requirements, as
> SPC-2 is vague on this topic and I'm rather uncomfortable with placing a
> requirement as strong as a "SHOULD" on an obsolete mechanism that "SHOULD NOT"
> be used.
>
> ---------------
>
> New problem in Section 4.6.3.3 - in this text:
>
>    The Logout response indicates that the connection or session
>    cleanup is completed and no other responses will arrive on the
>    connection (if received on the logging out connection). In
>    addition, the Logout Response indicates how long the target will
>    continue to hold resources for recovery (e.g., command execution
>    that continues on a new connection) in the text key Time2Retain
>    and how long the initiator must wait before proceeding with
>    recovery in the text key Time2Wait.
>
> Time2Wait and Time2Retain are not text keys - they're binary fields in the
> Logout Response PDU.  Two changes are in order:
>       - "text key Time2Retain" -> "Time2Retain field"
>       - "text key Time2Wait" -> "Time2Wait field"
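
As a hedged illustration of the "binary fields" point, the sketch below reads the two values straight out of a Logout Response header. The offsets (Time2Wait at bytes 40-41, Time2Retain at bytes 42-43 of a 48-byte basic header segment, opcode 0x26, big-endian) are my reading of the Logout Response layout and should be checked against the draft (Section 11.15, cited above) before being relied on; the function name is hypothetical.

```python
import struct
from typing import Tuple

LOGOUT_RESPONSE_OPCODE = 0x26  # assumed opcode, see lead-in
BHS_LENGTH = 48                # basic header segment length in bytes
TIMERS_OFFSET = 40             # assumed offset of Time2Wait, see lead-in


def parse_logout_timers(bhs: bytes) -> Tuple[int, int]:
    """Return (Time2Wait, Time2Retain) in seconds from a Logout Response BHS."""
    if len(bhs) < BHS_LENGTH:
        raise ValueError("header shorter than 48 bytes")
    # The opcode occupies the low six bits of byte 0.
    if bhs[0] & 0x3F != LOGOUT_RESPONSE_OPCODE:
        raise ValueError("not a Logout Response PDU")
    # Two consecutive 16-bit unsigned integers in network byte order.
    time2wait, time2retain = struct.unpack_from(">HH", bhs, TIMERS_OFFSET)
    return time2wait, time2retain
```

Nothing here involves key=value text negotiation, which is why "field" rather than "text key" is the right term in 4.6.3.3.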
>
> Thanks,
> --David
> ----------------------------------------------------
> David L. Black, Distinguished Engineer
> EMC Corporation, 176 South St., Hopkinton, MA  01748
> +1 (508) 293-7953              FAX: +1 (508) 293-7786
> david.black@emc.com        Mobile: +1 (978) 394-7754
> ----------------------------------------------------
>
> _______________________________________________
> storm mailing list
> storm@ietf.org
> https://www.ietf.org/mailman/listinfo/storm