Re: [storm] iSER update request from Bob Russell

"Robert D. Russell" <rdr@iol.unh.edu> Mon, 09 November 2009 22:04 UTC

Date: Mon, 09 Nov 2009 17:05:22 -0500
From: "Robert D. Russell" <rdr@iol.unh.edu>
To: Felix Marti <felix@chelsio.com>
Cc: storm@ietf.org, Black_David@emc.com

But can this feature be accessed via the OFED stack?
If so, could you please indicate how this can be done?

Thanks,
Bob Russell

On Mon, 2 Nov 2009, Felix Marti wrote:

> Chelsio iWarp products do support ZBVA.
>
>
>
> From: storm-bounces@ietf.org [mailto:storm-bounces@ietf.org] On Behalf
> Of Mallikarjun Chadalapaka
> Sent: Monday, November 02, 2009 6:02 PM
> To: Black_David@emc.com; Michael@huaweisymantec.com; storm@ietf.org
> Subject: Re: [storm] iSER update request from Bob Russell
>
>
>
> As I recall, the ZBVA issue was the other way around.  When we defined
> it, ZBVA was assumed to be a native feature of all iWARP
> implementations.  The reasoning, as I recall, was the following:
>
>
>
> - SCSI, Fibre Channel and iSCSI traditionally transfer only the
>   buffer/block *offset* into the I/O address space for that I/O, and
>   that model has worked well for high-performance offload
>   implementations.  There was no reason for iWARP to support a
>   different model requiring the base address exchange.
>
> - The starting VA of an STag can be managed in the local state of an
>   STag for a data source or data sink buffer, rather than transferring
>   it across the wire to the other side and getting it back.
>
> - There is value in keeping the iSER header small, as its size
>   dictates the size of each buffer in the anonymous buffer pool
>   attached to an RQ.  A smaller size means that, for a given size of a
>   shared RQ buffer pool in a PD/session, a target implementation can
>   support a larger iSCSI "queue depth" (larger MaxCmdSN).
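
To make the first two points above concrete, here is a minimal sketch of how a receiver might turn what arrives on the wire into a local address under each model. It is only an illustration: RegisteredBuffer, resolve_zbva and resolve_non_zbva are hypothetical names, not taken from any RFC, annex, or the OFED stack, and a real RNIC does this work in hardware.

```python
from dataclasses import dataclass


@dataclass
class RegisteredBuffer:
    """Hypothetical local state kept for one STag (illustrative only)."""
    stag: int
    base_va: int   # starting virtual address of the registered buffer
    length: int    # size of the registered buffer in bytes


def resolve_zbva(buf: RegisteredBuffer, tagged_offset: int) -> int:
    """ZBVA: the wire carries only an offset from the start of the buffer.

    The base address never leaves the local STag state.
    """
    if not 0 <= tagged_offset < buf.length:
        raise ValueError("offset outside registered buffer")
    return buf.base_va + tagged_offset


def resolve_non_zbva(buf: RegisteredBuffer, wire_va: int) -> int:
    """Non-ZBVA: the wire carries a full virtual address within the buffer.

    The peer must therefore have been told the base VA beforehand.
    """
    if not buf.base_va <= wire_va < buf.base_va + buf.length:
        raise ValueError("address outside registered buffer")
    return wire_va
```

Both functions reach the same local address. The difference is whether the base VA has to be exchanged with the peer (non-ZBVA) or can stay in local state (ZBVA).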
>
>
>
> So given what I remember anyways, I am actually surprised about the
> assertion that no iWARP implementations support ZBVA addressing.
> Perhaps iWARP vendors can chime in here?
>
>
>
> Mallikarjun
>
>
>
> From: storm-bounces@ietf.org [mailto:storm-bounces@ietf.org] On Behalf
> Of Black_David@emc.com
> Sent: Friday, October 16, 2009 5:38 PM
> To: Michael@huaweisymantec.com; storm@ietf.org
> Subject: Re: [storm] iSER update request from Bob Russell
>
>
>
> There's a lot of material in here.  With WG chair hat off, my
> opinions are:
>
> - If all the implementations start the connection immediately
>    in RDMA mode, then the RFC should be revised to reflect that
>    "running code".  I do hope the initial MPA Request and Reply
>    frames are being used.
> - If all the implementations are using the Expanded iSER header,
>    then the RFC ought to be revised to reflect "running code".
> - ZBVA was originally left out of the RFC because it was thought
>    to be IB-specific.  If that concept also applies to iWARP,
>    it belongs in the new iSER RFC ("running code" again).
>
> Thanks,
> --David
>
>
>
>
>
> ________________________________
>
> 	From: storm-bounces@ietf.org [mailto:storm-bounces@ietf.org] On
> Behalf Of Mike Ko
> 	Sent: Thursday, August 20, 2009 6:25 PM
> 	To: STORM
> 	Subject: [storm] iSER update request from Bob Russell
>
> 	On March 22, 2009, Bob Russell posted the following at the IPS
> mailing list.  I have embedded my responses below prefixed by <mk>.
>
>
>
> 	Mike
>
>
>
> 	-----  from Bob Russell  -----
>
>
>
> 	There are 2 issues I would like to suggest for discussion at the BOF
> 	meeting later this week.  Both have to do with the iSER spec, RFC 5046.
>
>
>
> 	1. At the present time, as far as I know, no existing hardware,
> 	   neither Infiniband nor iWARP, is capable of opening a connection
> 	   in "normal" TCP mode and then transitioning it into zero-copy mode.
> 	   Unfortunately, the iSER spec requires that.
> 	   Can't we just replace that part of the iSER spec?
> 	   Otherwise, all hardware and all implementations are non-standard.
>
>
>
> 	<mk> When I wrote the Supplement to Infiniband Architecture
> Specification Annex A12 (Support for iSCSI Extensions for RDMA), there
> was no mention of transitioning the connection from TCP mode to RDMA
> mode.  I can update RFC 5046 to remove this requirement if that
> is the consensus of the group.
>
>
>
> 	2. The OFED stack is used to access both Infiniband and iWARP hardware.
> 	   This software requires 2 extra 64-bit fields for addressing
> 	   on both Infiniband and iWARP hardware, but these fields
> 	   are not allowed for in the current iSER Header Format.
> 	   Can't we just add those extra fields to the iSER spec?
> 	   If someday some other implementation doesn't need those
> 	   fields, they can be just set to 0 (which is what is implied by
> 	   the current iSER standard anyway).  Again, by not doing this,
> 	   all implementations are non-standard.
>
>
>
> 	<mk> I assume you are suggesting that we should take the
> "Expanded iSER Header for Supporting Virtual Address" as defined in
> table 4 of Annex A12 and update RFC 5046 accordingly.  Again, I am fine
> with this if that is the consensus of the group.
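
As a rough illustration of what the "2 extra 64-bit fields" mean for the header size, the sketch below packs both variants with Python's struct module. The layouts are an assumption based on my reading of RFC 5046 and Annex A12 (a 32-bit flags word followed by Write STag and Read STag, with a 64-bit VA after each STag in the expanded form) and should be checked against those documents. The helper names are made up for this example.

```python
import struct

# Assumed layouts, in network byte order (">" also disables padding):
#   basic:    flags word (4 bytes), Write STag (4), Read STag (4)
#   expanded: flags word (4), Write STag (4), Write VA (8),
#             Read STag (4), Read VA (8)
BASIC_HDR = struct.Struct(">III")
EXPANDED_HDR = struct.Struct(">IIQIQ")


def pack_basic(flags: int, write_stag: int, read_stag: int) -> bytes:
    """Pack the assumed basic (ZBVA) header; flags is treated as opaque."""
    return BASIC_HDR.pack(flags, write_stag, read_stag)


def pack_expanded(flags: int, write_stag: int, write_va: int,
                  read_stag: int, read_va: int) -> bytes:
    """Pack the assumed expanded (non-ZBVA) header with both 64-bit VAs."""
    return EXPANDED_HDR.pack(flags, write_stag, write_va, read_stag, read_va)


def receive_slots(pool_bytes: int, buffer_bytes: int) -> int:
    """Number of fixed-size receive buffers that fit in a pool."""
    return pool_bytes // buffer_bytes
```

Under these assumptions the expanded header is 16 bytes longer than the basic one (EXPANDED_HDR.size - BASIC_HDR.size), which is exactly the two extra 64-bit fields. Setting both VA fields to 0 in pack_expanded corresponds to the "just set to 0" case mentioned in item 2.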
>
>
>
> 	In other words, I'm suggesting that we consider replacing the relevant
> 	parts of the current iSER specs with the current OFED specs on these
> 	2 issues.
>
>
>
> 	Thanks for your consideration,
> 	Bob Russell
>
>
>
> 	Note: The following (old) posting by Mike Ko states that the
> 	extra header fields are needed only by IB, not by IETF
> 	(i.e., iWARP), because IB uses nZBVA, whereas iWARP uses ZBVA.
>
>
>
> 	But are there any IETF/iWARP implementations out there that actually
> 	use ZBVA with iWARP RNICs?  (I don't mean software simulations of
> 	the iWARP protocol.)  We have built an iSER implementation that
> 	uses the OFED stack to access both IB and iWARP hardware, and for
> 	both of them we need to use the extra iSER header fields (nZBVA).
> 	Perhaps this is an issue with the design of the OFED stack, which
> 	was built primarily to access IB hardware and therefore reflects
> 	the needs of the IB hardware.  But we found that the only way to
> 	access iWARP hardware via the OFED stack was to use the expanded
> 	(nZBVA) iSER header (and to use a meaningful value in the extra field,
> 	NOT to just set it to zero).
>
>
>
> 	In any case, rather than have 2 different versions of the iSER header,
> 	it would be better to have just one, regardless of the underlying
> 	technology involved (after all, isn't that what a standard is for??).
> 	This is especially relevant when using the OFED stack, because,
> 	as we have demonstrated, software built on top of the OFED stack can
> 	(AND SHOULD!) be able to run with EITHER IB or iWARP hardware,
> 	with NO change to that software.  Having 2 different iSER headers
> 	does NOT make that possible!
>
>
>
>
>
>
>
> 	    2008/4/15 Mike Ko <mako at almaden.ibm.com>:
>
>
>
> 	    VA is a concept introduced in an Infiniband annex to support iSER.
> 	    It appears in the expanded iSER header for Infiniband use only to
> 	    support the non-Zero Based Virtual Address (non-ZBVA) used in
> 	    Infiniband vs the ZBVA used in IETF.
>
>
>
> 	Mike - could you please put me in contact with someone who has actually
> 	implemented iSER on top of IETF/iWARP hardware NICs using ZBVA?
>
>
>
> 	<mk> Perhaps vendors who have built iSER stacks can comment on this.
>
>
> 	    "The DataDescriptorOut describes the I/O buffer starting with the
> 	    immediate unsolicited data (if any), followed by the non-immediate
> 	    unsolicited data (if any) and solicited data." If non-ZBVA mode is
> 	    used, then VA points to the beginning of this buffer. So in your
> 	    example, the VA field in the expanded iSER header will be zero.
> 	    Note that for IETF, ZBVA is assumed and there is no provision to
> 	    specify a different VA in the iSER header.
>
>
>
> 	Mike - I believe this VA field in the expanded iSER header is almost
> 	NEVER zero -- it is always an actual virtual address.
>
>
>
> 	<mk> So if the OFED iSER stack is based on Annex A12, then it
> already has the means to select ZBVA or non-ZBVA as specified in the
> "iSER CM REQ Message Private Data Format" of table 2.  So rather than
> having to change the OFED implementation and update the Infiniband
> Annex, I suggest that we leave the ZBVA/non-ZBVA option alone even
> though ZBVA is never used as you said.
>
>
> 	    Tagged offset (TO) refers to the offset within a tagged buffer in
> 	    RDMA Write and RDMA Read Request Messages. When sending
> 	    non-immediate unsolicited data, Send Message types are used and the
> 	    TO field is not present. Instead, the buffer offset is appropriately
> 	    represented by the Buffer Offset field in the SCSI Data-Out PDU.
> 	    Note that Tagged Offset is not the same as write VA and it does not
> 	    appear in the iSER header.
>
>
>
> 	    Mike
>
>
>
>
>
>