[storm] iSER update request from Bob Russell

Mike Ko <Michael@huaweisymantec.com> Thu, 20 August 2009 22:24 UTC

On March 22, 2009, Bob Russell posted the following to the IPS mailing list.  My responses are embedded below, prefixed by <mk>.

Mike

-----  from Bob Russell  -----

There are 2 issues I would like to suggest for discussion at the BOF
meeting later this week.  Both have to do with the iSER spec, RFC 5046.

1. At the present time, as far as I know, no existing hardware,
   neither Infiniband nor iWARP, is capable of opening a connection
   in "normal" TCP mode and then transitioning it into zero-copy mode.
   Unfortunately, the iSER spec requires that.
   Can't we just replace that part of the iSER spec?
   Otherwise, all hardware and all implementations are non-standard.

<mk> When I wrote the Supplement to Infiniband Architecture Specification Annex A12 (Support for iSCSI Extensions for RDMA), there was no mention of transitioning the connection from TCP mode to RDMA mode.  I can update RFC 5046 to remove this requirement if that is the consensus of the group.

2. The OFED stack is used to access both Infiniband and iWARP hardware.
   This software requires 2 extra 64-bit fields for addressing
   on both Infiniband and iWARP hardware, but these fields
   are not allowed for in the current iSER Header Format.
   Can't we just add those extra fields to the iSER spec?
   If someday some other implementation doesn't need those
   fields, they can be just set to 0 (which is what is implied by
   the current iSER standard anyway).  Again, by not doing this,
   all implementations are non-standard.

<mk> I assume you are suggesting that we should take the "Expanded iSER Header for Supporting Virtual Address" as defined in table 4 of Annex A12 and update RFC 5046 accordingly.  Again, I am fine with this if that is the consensus of the group.
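
To make the size difference concrete, here is a minimal sketch of the two header layouts under discussion. It assumes the RFC 5046 control-type header is a flags byte (opcode 0001b plus the W and R bits), three reserved bytes, and two 32-bit STags, and that the expanded form places one 64-bit virtual address after each STag. The field order, the constant names, and the helpers pack_basic and pack_expanded are illustrative assumptions, not text taken from RFC 5046 or Annex A12.

```python
import struct

# Assumed flag-byte layout: upper nibble is the opcode, then the W and R bits.
ISER_OPCODE_CTRL = 0x10  # 0001b in the upper nibble (iSCSI control-type PDU)
ISER_WSV = 0x08          # Write STag valid
ISER_RSV = 0x04          # Read STag valid

# Big-endian, no padding: flags, 3 reserved bytes, Write STag, Read STag.
BASIC_FMT = ">B3xII"
# Assumed expanded layout: each STag is followed by a 64-bit virtual address.
EXPANDED_FMT = ">B3xIQIQ"


def pack_basic(write_stag=None, read_stag=None):
    """Pack a header with STags only (ZBVA is implied, no VA fields)."""
    flags = ISER_OPCODE_CTRL
    if write_stag is not None:
        flags |= ISER_WSV
    if read_stag is not None:
        flags |= ISER_RSV
    return struct.pack(BASIC_FMT, flags, write_stag or 0, read_stag or 0)


def pack_expanded(write=None, read=None):
    """Pack a header with (stag, va) pairs; pass None for an unused direction."""
    flags = ISER_OPCODE_CTRL
    if write is not None:
        flags |= ISER_WSV
    if read is not None:
        flags |= ISER_RSV
    w_stag, w_va = write or (0, 0)
    r_stag, r_va = read or (0, 0)
    return struct.pack(EXPANDED_FMT, flags, w_stag, w_va, r_stag, r_va)


basic = pack_basic(write_stag=0x1234)
expanded = pack_expanded(write=(0x1234, 0x7F00DEADB000))
print(len(basic), len(expanded))
```

Under these assumptions the two 64-bit fields add 16 bytes to the header. Setting both VA fields to zero in the expanded form carries the same addressing information as the basic form, which is the fallback Bob describes.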

In other words, I'm suggesting that we consider replacing the relevant
parts of the current iSER specs with the current OFED specs on these
2 issues.

Thanks for your consideration,
Bob Russell

Note: The following (old) posting by Mike Ko states that the
extra header fields are needed only by IB, not by IETF
(i.e., iWARP), because IB uses nZBVA, whereas iWARP uses ZBVA.

But are there any IETF/iWARP implementations out there that actually
use ZBVA with iWARP RNICs?  (I don't mean software simulations of
the iWARP protocol.)  We have built an iSER implementation that
uses the OFED stack to access both IB and iWARP hardware, and for
both of them we need to use the extra iSER header fields (nZBVA).
Perhaps this is an issue with the design of the OFED stack, which
was built primarily to access IB hardware and therefore reflects
the needs of the IB hardware.  But we found that the only way to
access iWARP hardware via the OFED stack was to use the expanded
(nZBVA) iSER header (and to use a meaningful value in the extra field,
NOT to just set it to zero).

In any case, rather than have 2 different versions of the iSER header,
it would be better to have just one, regardless of the underlying
technology involved (after all, isn't that what a standard is for??).
This is especially relevant when using the OFED stack, because,
as we have demonstrated, software built on top of the OFED stack can
(AND SHOULD!) be able to run with EITHER IB or iWARP hardware,
with NO change to that software.  Having 2 different iSER headers
does NOT make that possible!



    2008/4/15 Mike Ko <mako at almaden.ibm.com>:

    VA is a concept introduced in an Infiniband annex to support iSER. It
    appears in the expanded iSER header for Infiniband use only to support the
    non-Zero Based Virtual Address (non-ZBVA) used in Infiniband vs the ZBVA
    used in IETF.

Mike - could you please put me in contact with someone who has actually
implemented iSER on top of IETF/iWARP hardware NICs using ZBVA?

<mk> Perhaps vendors who have built iSER stacks can comment on this.

    "The DataDescriptorOut describes the I/O buffer starting with the immediate
    unsolicited data (if any), followed by the non-immediate unsolicited data
    (if any) and solicited data." If non-ZBVA mode is used, then VA points to
    the beginning of this buffer. So in your example, the VA field in the
    expanded iSER header will be zero. Note that for IETF, ZBVA is assumed and
    there is no provision to specify a different VA in the iSER header.

Mike - I believe this VA field in the expanded iSER header is almost
NEVER zero -- it is always an actual virtual address.

<mk> So if the OFED iSER stack is based on Annex A12, then it already has the means to select ZBVA or non-ZBVA as specified in the "iSER CM REQ Message Private Data Format" of table 2.  So rather than having to change the OFED implementation and update the Infiniband Annex, I suggest that we leave the ZBVA/non-ZBVA option alone even though ZBVA is never used as you said.

    Tagged offset (TO) refers to the offset within a tagged buffer in RDMA Write
    and RDMA Read Request Messages. When sending non-immediate unsolicited
    data, Send Message types are used and the TO field is not present. Instead,
    the buffer offset is appropriately represented by the Buffer Offset field in
    the SCSI Data-Out PDU. Note that Tagged Offset is not the same as write VA
    and it does not appear in the iSER header.

    Mike
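
The distinction between the VA and the Tagged Offset can be sketched as follows. This assumes that with ZBVA the offset used in an RDMA Write or RDMA Read Request is simply the byte offset into the advertised buffer, while with non-ZBVA it is the advertised VA plus that byte offset. The function name rdma_tagged_offset and its parameters are hypothetical and only illustrate the arithmetic.

```python
def rdma_tagged_offset(buffer_offset, base_va=0, zbva=True):
    """Return the offset a peer would place in an RDMA Write/Read Request.

    buffer_offset: byte offset into the I/O buffer (as in Buffer Offset).
    base_va:       VA advertised in the expanded iSER header (non-ZBVA only).
    zbva:          True if the buffer was registered zero-based.
    """
    if buffer_offset < 0:
        raise ValueError("buffer_offset must be non-negative")
    if zbva:
        # Zero-based: the buffer starts at offset 0, so no VA is needed.
        return buffer_offset
    # Non-zero-based: offsets are relative to the advertised virtual address.
    return base_va + buffer_offset


# Same position in the buffer, expressed in the two addressing modes.
print(hex(rdma_tagged_offset(4096)))
print(hex(rdma_tagged_offset(4096, base_va=0x7F0000000000, zbva=False)))
```

In this sketch the VA is fixed when the buffer is advertised, whereas the Tagged Offset changes with every RDMA operation, which is why the two are not interchangeable.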