RE: [Ips] detection of failed sessions to allow re-login

"Sandars, Ken" <ken_sandars@adaptec.com> Fri, 20 April 2007 00:56 UTC


Hi Paul,
 
That's a target problem, most likely a bug. The second login with TSIH=0
tells the target to perform session reinstatement. That's jargon for
"silently nuke the first session".
 
The target may be failing the login because its internal cleanup
requires more time (I/O requests from the previous session are jammed,
for instance). Your scenario suggests this is not likely.
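
As a rough illustration of the reinstatement rule described above, here is
a minimal Python sketch of how a target might dispatch a login request. The
names Target, Session, SessionKey, and login() are hypothetical and do not
come from any real iSCSI stack; the sketch assumes a session is identified
by InitiatorName, ISID, and TargetPortalGroupTag, and it ignores the
asynchronous cleanup of outstanding I/O.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass(frozen=True)
class SessionKey:
    """Hypothetical lookup key for a session on the target side."""
    initiator_name: str
    isid: bytes
    target_portal_group_tag: int


@dataclass
class Session:
    key: SessionKey
    tsih: int
    connections: set = field(default_factory=set)


class Target:
    """Toy target that only tracks sessions; no real I/O is modelled."""

    def __init__(self):
        self.sessions = {}          # SessionKey -> Session
        self._next_tsih = count(1)  # TSIH 0 is reserved for "new session"
        self.closed = []            # TSIHs of sessions that were torn down

    def login(self, key, tsih, cid):
        existing = self.sessions.get(key)
        if tsih == 0:
            if existing is not None:
                # Session reinstatement: the old session is closed
                # implicitly, whether or not the target had already
                # noticed that it failed.
                self._close(existing)
            session = Session(key, next(self._next_tsih), {cid})
            self.sessions[key] = session
            return session
        if existing is None or existing.tsih != tsih:
            # A non-zero TSIH must name a session the target knows about.
            raise LookupError("session does not exist")
        # Non-zero TSIH on a known session: the connection joins it. This
        # sketch does not distinguish a new CID from a reused one.
        existing.connections.add(cid)
        return existing

    def _close(self, session):
        self.closed.append(session.tsih)
        del self.sessions[session.key]
```

With this sketch, two back-to-back logins that use the same key and TSIH=0
(the HBA BIOS, then the HBA driver) both succeed; the second one closes the
first session instead of being rejected.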
 
HTH,
Ken

________________________________

From: Paul Hughes [mailto:phughes@pillardata.com] 
Sent: Friday, 20 April 2007 09:14
To: ips@ietf.org
Subject: [Ips] detection of failed sessions to allow re-login


I have a question about how a target can quickly detect session failures
so that a re-login can succeed.
 
Here's my scenario:
 
1) an initiator is booting from an iSCSI target
2) the initiator is using an iSCSI HBA to communicate with the iSCSI
target
3) the HBA BIOS creates the first session, discovers the boot LUN, and
reads the boot loader
4) the boot loader reads the kernel from the boot LUN
5) the kernel resets the iSCSI HBA while loading an HBA driver
6) the HBA driver attempts to create a new session
 
The problem I'm seeing is that the target is failing the login for the
new session because the target thinks the first session created by the
HBA BIOS is still valid (not in a failed state).  The HBA reset was not
detected by the target soon enough for the target to know that the first
session is now in the failed state when the initiator attempts to log in
and create the second session using the same InitiatorName, ISID,
TargetName, and TargetPortalGroupTag as the first session (with TSIH=0).
The target does not see a link down event because a switch is connected
between the HBA and the target port.  The target eventually detects that
the first session has failed when it sends a NOP-In PDU and receives a
transport failure.  Unfortunately, this occurs too late and the boot
fails.
 
In my case the target is sending NOP-In PDUs every 60 seconds.  I can
change that to 5 seconds, but I don't think that will fix every case.
Is there a better way for the target to determine that the first session
has failed so that a re-login will succeed on the first try?
 
Thanks,
Paul
 
 
 
_______________________________________________
Ips mailing list
Ips@ietf.org
https://www1.ietf.org/mailman/listinfo/ips