Re: [storm] iSER update request from Bob Russell

<Black_David@emc.com> Sat, 17 October 2009 00:39 UTC

From: Black_David@emc.com
To: Michael@huaweisymantec.com, storm@ietf.org

There's a lot of material in here.  With WG chair hat off, my
opinions are:
 
- If all the implementations start the connection immediately
    in RDMA mode, then the RFC should be revised to reflect that
    "running code".  I do hope the initial MPA Request and Reply
    frames are being used.
- If all the implementations are using the Expanded iSER header,
    then the RFC ought to be revised to reflect "running code".
- ZBVA was originally left out of the RFC because it was thought
    to be IB-specific.  If that concept also applies to iWARP,
    it belongs in the new iSER RFC ("running code" again).

Thanks,
--David


 


________________________________

	From: storm-bounces@ietf.org [mailto:storm-bounces@ietf.org] On Behalf Of Mike Ko
	Sent: Thursday, August 20, 2009 6:25 PM
	To: STORM
	Subject: [storm] iSER update request from Bob Russell
	
	
	On March 22, 2009, Bob Russell posted the following at the IPS
	mailing list.  I have embedded my responses below prefixed by <mk>.
	 
	Mike
	 
	-----  from Bob Russell  -----
	 
	There are 2 issues I would like to suggest for discussion at the BOF
	meeting later this week.  Both have to do with the iSER spec, RFC 5046.
	 
	1. At the present time, as far as I know, no existing hardware,
	   neither Infiniband nor iWARP, is capable of opening a connection
	   in "normal" TCP mode and then transitioning it into zero-copy mode.
	   Unfortunately, the iSER spec requires that.
	   Can't we just replace that part of the iSER spec?
	   Otherwise, all hardware and all implementations are non-standard.
	 
	<mk> When I wrote the Supplement to Infiniband Architecture
Specification Annex A12 (Support for iSCSI Extensions for RDMA), there
was no mention of transitioning the connection from TCP mode to RDMA
mode.  I can update RFC 5046 to remove this requirement if that
is the consensus of the group.
	 
	2. The OFED stack is used to access both Infiniband and iWARP hardware.
	   This software requires 2 extra 64-bit fields for addressing
	   on both Infiniband and iWARP hardware, but these fields
	   are not allowed for in the current iSER Header Format.
	   Can't we just add those extra fields to the iSER spec?
	   If someday some other implementation doesn't need those
	   fields, they can be just set to 0 (which is what is implied by
	   the current iSER standard anyway).  Again, by not doing this,
	   all implementations are non-standard.
	 
	<mk> I assume you are suggesting that we should take the
"Expanded iSER Header for Supporting Virtual Address" as defined in
table 4 of Annex A12 and update RFC 5046 accordingly.  Again, I am fine
with this if that is the consensus of the group.
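
For readers following the header discussion, the size difference between the two header forms can be sketched roughly as below. This is an illustration only: it assumes the current header is a 32-bit flags word followed by a Write STag and a Read STag, and that the expanded form adds one 64-bit virtual address after each STag. The field order, the format strings, and the function names are assumptions made for this sketch, not something stated in this thread; RFC 5046 and Annex A12 remain the authority on the actual layout.

```python
import struct

# Hypothetical layouts for illustration only; field order is an assumption.
# ">" selects big-endian (network order) with no padding between fields.
BASE_FMT = ">III"        # flags word, Write STag, Read STag
EXPANDED_FMT = ">IIQIQ"  # flags word, Write STag, Write VA, Read STag, Read VA


def pack_base(flags, write_stag, read_stag):
    """Pack a header without the extra addressing fields."""
    return struct.pack(BASE_FMT, flags, write_stag, read_stag)


def pack_expanded(flags, write_stag, write_va, read_stag, read_va):
    """Pack a header carrying the two extra 64-bit addressing fields."""
    return struct.pack(EXPANDED_FMT, flags, write_stag, write_va,
                       read_stag, read_va)


def unpack_expanded(data):
    """Return (flags, write_stag, write_va, read_stag, read_va)."""
    return struct.unpack(EXPANDED_FMT, data)
```

Under these assumptions the expanded header is 16 bytes longer than the base one, which is exactly the two extra 64-bit fields Bob describes; setting both to 0 gives the "other implementation doesn't need them" case.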
	 
	In other words, I'm suggesting that we consider replacing the relevant
	parts of the current iSER specs with the current OFED specs on these
	2 issues.
	 
	Thanks for your consideration,
	Bob Russell
	 
	Note: The following (old) posting by Mike Ko states that the
	extra header fields are needed only by IB, not by IETF
	(i.e., iWARP), because IB uses nZBVA, whereas iWARP uses ZBVA.
	 
	But are there any IETF/iWARP implementations out there that actually
	use ZBVA with iWARP RNICs?  (I don't mean software simulations of
	the iWARP protocol.)  We have built an iSER implementation that
	uses the OFED stack to access both IB and iWARP hardware, and for
	both of them we need to use the extra iSER header fields (nZBVA).
	Perhaps this is an issue with the design of the OFED stack, which
	was built primarily to access IB hardware and therefore reflects
	the needs of the IB hardware.  But we found that the only way to
	access iWARP hardware via the OFED stack was to use the expanded
	(nZBVA) iSER header (and to use a meaningful value in the extra field,
	NOT to just set it to zero).
	 
	In any case, rather than have 2 different versions of the iSER header,
	it would be better to have just one, regardless of the underlying
	technology involved (after all, isn't that what a standard is for??).
	This is especially relevant when using the OFED stack, because,
	as we have demonstrated, software built on top of the OFED stack can
	(AND SHOULD!) be able to run with EITHER IB or iWARP hardware,
	with NO change to that software.  Having 2 different iSER headers
	does NOT make that possible!
	 
	 
	 
	    2008/4/15 Mike Ko <mako at almaden.ibm.com>:
	 
	    VA is a concept introduced in an Infiniband annex to support iSER. It
	    appears in the expanded iSER header for Infiniband use only to support the
	    non-Zero Based Virtual Address (non-ZBVA) used in Infiniband vs the ZBVA
	    used in IETF.
	 
	Mike - could you please put me in contact with someone who has actually
	implemented iSER on top of IETF/iWARP hardware NICs using ZBVA?
	 
	<mk> Perhaps vendors who have built iSER stacks can comment on this.
	

	    "The DataDescriptorOut describes the I/O buffer starting with the immediate
	    unsolicited data (if any), followed by the non-immediate unsolicited data
	    (if any) and solicited data." If non-ZBVA mode is used, then VA points to
	    the beginning of this buffer. So in your example, the VA field in the
	    expanded iSER header will be zero. Note that for IETF, ZBVA is assumed and
	    there is no provision to specify a different VA in the iSER header.
	 
	Mike - I believe this VA field in the expanded iSER header is almost
	NEVER zero -- it is always an actual virtual address.
	 
	<mk> So if the OFED iSER stack is based on Annex A12, then it
already has the means to select ZBVA or non-ZBVA as specified in the
"iSER CM REQ Message Private Data Format" of table 2.  So rather than
having to change the OFED implementation and update the Infiniband
Annex, I suggest that we leave the ZBVA/non-ZBVA option alone even
though ZBVA is never used as you said.
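
One way to picture the ZBVA / non-ZBVA distinction in this exchange is as a choice of base for the address placed in an RDMA Write or RDMA Read Request. The sketch below assumes the target forms that address by adding the data's offset within the I/O buffer to a base, where the base is 0 for ZBVA and the VA carried in the expanded header for non-ZBVA. The function name and the 64-bit wraparound are hypothetical choices for this illustration, not behavior quoted from RFC 5046 or Annex A12.

```python
U64_MASK = 0xFFFFFFFFFFFFFFFF


def rdma_tagged_offset(buffer_offset, base_va=0):
    """Address a target might use for data at buffer_offset.

    base_va == 0 models ZBVA (addressing starts at zero within the
    registered buffer); a nonzero base_va models non-ZBVA, where the
    initiator advertises the buffer's actual virtual address.
    Hypothetical helper for illustration only.
    """
    if buffer_offset < 0 or base_va < 0:
        raise ValueError("offsets and addresses must be non-negative")
    # Keep the result within 64 bits, the width of the VA field.
    return (base_va + buffer_offset) & U64_MASK


# ZBVA: the offset is used as-is.
zbva = rdma_tagged_offset(4096)

# non-ZBVA: the same offset is relative to an advertised virtual address.
non_zbva = rdma_tagged_offset(4096, base_va=0x7F0000001000)
```

Seen this way, Bob's observation that the VA field is "almost NEVER zero" corresponds to the second call, and a header with no VA field can only express the first.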

	    Tagged offset (TO) refers to the offset within a tagged buffer in RDMA Write
	    and RDMA Read Request Messages. When sending non-immediate unsolicited
	    data, Send Message types are used and the TO field is not present. Instead,
	    the buffer offset is appropriately represented by the Buffer Offset field in
	    the SCSI Data-Out PDU. Note that Tagged Offset is not the same as write VA
	    and it does not appear in the iSER header.
	 
	    Mike