Re: [imss] [Ips] Storage Maintenance (storm) BOF reminder & requests

"Robert D. Russell" <> Sun, 22 March 2009 15:41 UTC

Date: Sun, 22 Mar 2009 11:41:55 -0400 (EDT)
From: "Robert D. Russell" <>
List-Id: Internet and Management Support for Storage Working Group <>


There are 2 issues I would like to suggest for discussion at the BOF
meeting later this week.  Both have to do with the iSER spec, RFC 5046.

1. At the present time, as far as I know, no existing hardware,
    neither Infiniband nor iWARP, is capable of opening a connection
    in "normal" TCP mode and then transitioning it into zero-copy mode.
    Unfortunately, the iSER spec requires that.
    Can't we just replace that part of the iSER spec?
    Otherwise, all hardware and all implementations are non-standard.

2. The OFED stack is used to access both Infiniband and iWARP hardware.
    This software requires 2 extra 64-bit fields for addressing
    on both Infiniband and iWARP hardware, but these fields
    are not allowed for in the current iSER Header Format.
    Can't we just add those extra fields to the iSER spec?
    If someday some other implementation doesn't need those
    fields, they can be just set to 0 (which is what is implied by
    the current iSER standard anyway).  Again, by not doing this,
    all implementations are non-standard.
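
To make the size difference in issue 2 concrete, here is a minimal sketch in Python of the two header layouts. It is an illustration only, not text from either spec. It assumes the base header is 12 bytes (one flags byte, three reserved bytes, a 32-bit Write STag and a 32-bit Read STag) and that the expanded header places one 64-bit address after each STag, for 28 bytes. The flag values and all names (ISER_CTRL, ISER_WSV, ISER_RSV, pack_base, pack_expanded) are assumptions chosen for the example.

```python
import struct

# Assumed flag bits in the first header byte (illustrative values).
ISER_CTRL = 0x10  # assumed: header accompanies an iSCSI control-type PDU
ISER_WSV = 0x08   # assumed: Write STag field is valid
ISER_RSV = 0x04   # assumed: Read STag field is valid

# "!" selects network byte order with no alignment padding.
# Assumed base layout: flags, 3 reserved bytes, Write STag, Read STag.
BASE_HDR = struct.Struct("!B3xII")
# Assumed expanded layout: each STag is followed by a 64-bit address.
EXPANDED_HDR = struct.Struct("!B3xIQIQ")


def _flags(write, read):
    """Build the flags byte from which of the two STags are present."""
    flags = ISER_CTRL
    if write is not None:
        flags |= ISER_WSV
    if read is not None:
        flags |= ISER_RSV
    return flags


def pack_base(write_stag=None, read_stag=None):
    """Pack the assumed 12-byte header (STags only, no addresses)."""
    return BASE_HDR.pack(_flags(write_stag, read_stag),
                         write_stag or 0, read_stag or 0)


def pack_expanded(write=None, read=None):
    """Pack the assumed 28-byte header; write/read are (stag, va) pairs."""
    w_stag, w_va = write if write is not None else (0, 0)
    r_stag, r_va = read if read is not None else (0, 0)
    return EXPANDED_HDR.pack(_flags(write, read), w_stag, w_va, r_stag, r_va)
```

Under these assumptions, an implementation that does not need the extra fields would call pack_expanded with an address of 0 in each pair, and it would carry the same STags as pack_base in a header that is 16 bytes longer.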

In other words, I'm suggesting that we consider replacing the relevant
parts of the current iSER specs with the current OFED specs on these
2 issues.

Thanks for your consideration,
Bob Russell

Note: The following (old) posting by Mike Ko states that the
extra header fields are needed only by IB, not by IETF
(i.e., iWARP), because IB uses nZBVA, whereas iWARP uses ZBVA.

But are there any IETF/iWARP implementations out there that actually
use ZBVA with iWARP RNICs?  (I don't mean software simulations of
the iWARP protocol.)  We have built an iSER implementation that
uses the OFED stack to access both IB and iWARP hardware, and for
both of them we need to use the extra iSER header fields (nZBVA).
Perhaps this is an issue with the design of the OFED stack, which
was built primarily to access IB hardware and therefore reflects
the needs of the IB hardware.  But we found that the only way to
access iWARP hardware via the OFED stack was to use the expanded
(nZBVA) iSER header (and to use a meaningful value in the extra field,
NOT to just set it to zero).

In any case, rather than have 2 different versions of the iSER header,
it would be better to have just one, regardless of the underlying
technology involved (after all, isn't that what a standard is for??).
This is especially relevant when using the OFED stack, because,
as we have demonstrated, software built on top of the OFED stack can
(AND SHOULD!) be able to run with EITHER IB or iWARP hardware,
with NO change to that software.  Having 2 different iSER headers
does NOT make that possible!

> 2008/4/15 Mike Ko <mako at>:
> VA is a concept introduced in an Infiniband annex to support iSER. It
> appears in the expanded iSER header for Infiniband use only to support the
> non-Zero Based Virtual Address (non-ZBVA) used in Infiniband vs the ZBVA
> used in IETF.

Mike - could you please put me in contact with someone who has actually
implemented iSER on top of IETF/iWARP hardware NICs using ZBVA?

> "The DataDescriptorOut describes the I/O buffer starting with the immediate
> unsolicited data (if any), followed by the non-immediate unsolicited data
> (if any) and solicited data." If non-ZBVA mode is used, then VA points to
> the beginning of this buffer. So in your example, the VA field in the
> expanded iSER header will be zero. Note that for IETF, ZBVA is assumed and
> there is no provision to specify a different VA in the iSER header.

Mike - I believe this VA field in the expanded iSER header is almost
NEVER zero -- it is always an actual virtual address.
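
As a rough illustration of why the two modes disagree about that field, the following Python sketch computes the value an RDMA request would carry for a byte at a given buffer offset. It assumes the usual reading of the terms: with ZBVA the registered buffer is addressed from offset 0, while with non-ZBVA it is addressed from the virtual address at which it was registered. The function name and parameters are hypothetical.

```python
def rdma_target_offset(buffer_offset, base_va=None):
    """Return the address or offset used to reach buffer_offset.

    base_va=None models ZBVA: the tagged buffer starts at offset 0,
    so the buffer offset is used unchanged.
    Otherwise base_va models non-ZBVA: the offset is added to the
    virtual address of the start of the registered buffer.
    """
    if buffer_offset < 0:
        raise ValueError("buffer_offset must be non-negative")
    if base_va is None:
        return buffer_offset
    # Keep the result within 64 bits, the assumed width of the VA field.
    return (base_va + buffer_offset) & 0xFFFFFFFFFFFFFFFF
```

In this model the two modes coincide only when base_va happens to be 0, which is the case the quoted text describes; a real registration normally yields a nonzero virtual address.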

> Tagged offset (TO) refers to the offset within a tagged buffer in RDMA Write
> and RDMA Read Request Messages. When sending non-immediate unsolicited
> data, Send Message types are used and the TO field is not present. Instead,
> the buffer offset is appropriately represented by the Buffer Offset field in
> the SCSI Data-Out PDU. Note that Tagged Offset is not the same as write VA
> and it does not appear in the iSER header.

On Wed, 11 Mar 2009, wrote:

> This is a reminder that the Storage Maintenance BOF will
> be held in about 2 weeks at the IETF meetings in San Francisco.
> Please plan to attend if you're interested:
> THURSDAY, March 26, 2009
> Continental 1&2  	TSV  	storm  	 Storage Maintenance BOF
> The BOF description is at:
> The initial agenda is here:
> I'm going to go upload that initial agenda as the BOF agenda,
> and it can be bashed at the meeting.
> The primary purpose of this BOF is to answer two questions:
> (1) What storage maintenance work (IP Storage, Remote Direct
> 	Data Placement) should be done?
> (2) Should an IETF Working Group be formed to undertake that
> 	work?
> Everyone gets to weigh in on these decisions, even those who
> can't attend the BOF meeting.  Anyone who thinks that there is
> work that should be done, and who cannot come to the BOF meeting
> should say so on the IPS or RDDP mailing lists (and it'd be a
> good idea for those who can come to do this).  As part of the
> email, please indicate how you're interested in helping (author
> or co-author of specific drafts, promise to review and comment
> on specific drafts).
> Here's a summary of the initial draft list of work items:
> - iSCSI: Combine RFCs into one document, removing unused features.
> - iSCSI: Interoperability report on what has been implemented and
> 	interoperates in support of Draft Standard status for iSCSI.
> - iSCSI: Add backwards-compatible features to support SAM-4.
> - iFCP: The Address Translation mode of iFCP needs to be deprecated.
> - RDDP MPA: Small startup update for MPI application support.
> - iSER: A few minor updates based on InfiniBand experience.
> Additional work (e.g., updated/improved iSNS for iSCSI, MIB changes,
> updated ipsec security profile [i.e., IKEv2-based]) is possible if
> there's interest.
> There are (at least) four possible outcomes:
> (A) None of this work needs to be done.
> (B) There are some small work items that make sense.  Individual
> 	drafts with a draft shepherd (i.e., David Black) will
> 	suffice.
> (C) A working group is needed to undertake more complex work
> 	items and reach consensus on design issues.  The WG can
> 	be "virtual" and operate mostly via the mailing list
> 	until/unless controversial/contentious issues arise.
> (D) There is a lot of complex work that is needed, and a WG
> 	that will plan to meet at every IETF meeting should be
> 	formed.
> Please note that the IETF "rough consensus" process requires a
> working group in practice to be effective.  This makes outcome
> (C) look attractive to me, as:
> - I'm coming under increasing pressure to limit travel, and
> 	the next two IETF meetings after San Francisco are not
> 	in the US.
> - I'd rather have the "rough consensus" process available and
> 	not need it than need it and not have it available.
> Setting an example for how to express interest ...
> ---------------
> I think that the iSCSI single RFC and interoperability report are
> good ideas, but I want to see a bunch of people expressing interest
> in these, as significant effort is involved.  It might make sense
> to do the single iSCSI RFC but put off the interoperability report
> (the resulting RFC would remain at Proposed Standard rather than
> going to Draft Standard), as I'm not hearing about major iSCSI
> interoperability issues.
> I think the latter four items (SAM-4 for iSCSI, deprecate iFCP
> address translation, MPI fix to MPA and iSER fixes) should all
> be done.
> I plan to author the iFCP address translation deprecation draft,
> and review all other drafts.
> I think that a virtual WG should be formed that plans to do its
> work primarily via the mailing list.  I believe the SAM-4 work
> by itself is complex enough to need a working group - I would
> expect design issues to turn up at least there and in determining
> whether to remove certain iSCSI features, but I'm cautiously
> optimistic that the mailing list is sufficient to work these
> issues out (and concerned that travel restrictions are likely to
> force use of the mailing list).
> -----------------
> Ok, who wants to go next?
> Thanks,
> --David
> ----------------------------------------------------
> David L. Black, Distinguished Engineer
> EMC Corporation, 176 South St., Hopkinton, MA  01748
> +1 (508) 293-7953             FAX: +1 (508) 293-7786
>        Mobile: +1 (978) 394-7754
> ----------------------------------------------------