RE: iSCSI reqmts and Ethernet adapters

"Douglas Otis" <dotis@sanlight.net> Tue, 24 April 2001 10:43 UTC

From: Douglas Otis <dotis@sanlight.net>
To: ips@ece.cmu.edu, marjorie_krueger@hp.com, Black_David@emc.com
Cc: sob@harvard.edu, egrodriguez@lucent.com, mankin@east.isi.edu
Subject: RE: iSCSI reqmts and Ethernet adapters
Date: Mon, 23 Apr 2001 21:57:30 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJOEMACGAA.dotis@sanlight.net>
In-Reply-To: <0F31E5C394DAD311B60C00E029101A070801549E@corpmx9.isus.emc.com>

David,

Thanks for correcting the URL for the document.  Although I posted this
document four days ago, it still has not been published as an I-D.  Until
it is, here is a reference to it:
http://ips.pdl.cs.cmu.edu/mail/msg04238.html

I'll stand by my point that the stated intent is to implement this protocol
in hardware.  The same is true for iFCP and FCIP.

Page 8 of the requirements document:
    "(2)  Development of Ethernet storage NICs and related driver and
         protocol software; [NOTE: high-speed applications of iSCSI are
         expected to require significant portions of the iSCSI/TCP/IP
         implementation in hardware to achieve the necessary
         throughput.]"

Page 11:
  "2.3. Framing

   Framing refers to the addition of information in a header, or the
   data stream to allow implementations to locate the boundaries of an
   iSCSI protocol data unit (PDU) within the TCP byte stream.  There
   are two technical requirements driving framing: interfacing needs,
   and accelerated processing needs.

   A framing solution that addresses the "interfacing needs" of the
   iSCSI protocol will facilitate the implementation of a message-based
   upper layer protocol (iSCSI) on top of an underlying byte streaming
   protocol (TCP).  Since TCP is a reliable transport, this can be
   accomplished by including a length field in the iSCSI header.
   Finding the protocol frame assumes that the receiver will parse from
   the beginning of the TCP data stream, and never make a mistake (lose
   alignment on packet headers).

   The other technical requirement for framing, "accelerated
   processing", stems from the need to handle increasingly higher data
   rates in the physical media interface.  Two needs arise from higher
   data rates:

   (1)  LAN environment - NIC vendors seek ways to provide "zero-copy"
        methods of moving data directly from the wire into application
        buffers.

   (2)  WAN environment- the emergence of high bandwidth, high latency,
        low bit error rate physical media places huge buffer
        requirements on the physical interface solutions.

   First, vendors are producing network processing hardware that
   offloads network protocols to hardware solutions to achieve higher
   data rates.  The concept of "zero-copy" seeks to store blocks of
   data in appropriate memory locations (aligned) directly off the
   wire, even in when data is reordered due to packet loss.  This is
   necessary to drive actual data rates of 10 Gigabits and beyond.

   Secondly, in order for iSCSI to be successful in the WAN arena it
   MUST be possible to operate efficiently in high bandwidth, high
   delay networks.  The emergence of multi-gigabit IP networks with
   latencies in the tens to hundreds of milliseconds presents a
   challenge. To fill such large pipes, tens of megabytes of
   outstanding requests from the application are needed. In addition,
   some protocols potentially require tens of megabytes at the
   transport layer to deal with buffering for reassembly of data when
   packets are received out-of-order.

   Consider that a network pipe at 10 Gbps x 200 msec holds 250 MB.
   [Assume land-based communication with a spot half way around the
   world at the equator.  Ignore additional distance due to cable
   routing.  Ignore repeater and switching delays; consider only a
   speed-of-light delay of 5 microsec/km.  The circumference of the
   globe at the equator is approx. 40000 km (round-trip delay must be
   considered to keep the pipe full).  10 Gb/sec x 40000 km x 5
   microsec/km x B / 8b = 250 MB].  In a conventional TCP
   implementation, loss of a TCP segment means that stream processing
   MUST stop until that segment is recovered, which takes at least a
   time of <network round trip> to accomplish.  Following the example
   above, an implementation would be obliged to catch 250 MB of data
   into an anonymous buffer before resuming stream processing; later,
   this data would need to be moved to its proper location.  Some
   proponents of iSCSI seek some means of putting data directly where
   it belongs, and avoiding extra data movement in the case of segment
   drop.  This is a key concept in understanding the debate behind
   framing methodologies.

   The framing of the iSCSI protocol impacts both the "interfacing
   needs" and the "accelerated processing needs", however, while
   including a length in a header may suffice for the "interfacing
   needs", it will not serve the "accelerated processing needs". The
   framing mechanism developed should allow resynchronization of packet
   boundaries even in the case where a packet is temporarily missing in
   the incoming data stream."
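
As a quick check of the arithmetic in the quoted section 2.3, the pipe
capacity can be worked out with integer arithmetic.  This is only a sketch:
the function names are made up for illustration, and the 5 microsec/km
delay and the 40000 km round-trip distance are the assumptions stated in
the requirements document itself.

```python
# Sketch of the bandwidth-delay arithmetic from section 2.3 of the
# requirements document. Function names are hypothetical.

def round_trip_delay_us(round_trip_km: int, us_per_km: int = 5) -> int:
    """Speed-of-light delay only; repeater and switching delays are ignored."""
    return round_trip_km * us_per_km


def pipe_capacity_bytes(rate_bits_per_s: int, delay_us: int) -> int:
    """Bytes in flight needed to keep a pipe of the given rate full."""
    bits_in_flight = rate_bits_per_s * delay_us // 1_000_000
    return bits_in_flight // 8


delay = round_trip_delay_us(40_000)                 # 200000 microsec = 200 msec
capacity = pipe_capacity_bytes(10_000_000_000, delay)
print(delay, capacity)                              # 200000 250000000 (250 MB)
```

The 250 MB result is the amount of data a conventional TCP receiver would
have to hold in an anonymous buffer while one lost segment is recovered.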


Here the IPS WG is developing a framing protocol that also increases the
level of error detection, and it has stated explicitly that it intends this
protocol to be supported directly in hardware.  I will be happy to show how
this protocol can be mapped into a common structure that would make the same
hardware acceleration available to more protocols and, at the same time,
ease the transition to SCTP.
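
For reference, the length-field approach that section 2.3 calls sufficient
for the "interfacing needs" can be sketched roughly as follows.  The 4-byte
big-endian length prefix and the names frame and Deframer are hypothetical,
chosen only for illustration; this is not the iSCSI header layout.

```python
import struct
from typing import List

# Hypothetical header: a 4-byte big-endian payload length.
# This is an assumption for illustration, not the iSCSI PDU header.
HEADER = struct.Struct("!I")


def frame(payload: bytes) -> bytes:
    """Prefix a message with its length so it can be found in a byte stream."""
    return HEADER.pack(len(payload)) + payload


class Deframer:
    """Recovers messages from an in-order byte stream such as TCP delivers."""

    def __init__(self) -> None:
        self._buf = bytearray()

    def feed(self, chunk: bytes) -> List[bytes]:
        """Add received bytes and return every message completed so far."""
        self._buf += chunk
        messages = []
        while len(self._buf) >= HEADER.size:
            (length,) = HEADER.unpack_from(self._buf, 0)
            end = HEADER.size + length
            if len(self._buf) < end:
                break  # wait for the rest of this message
            messages.append(bytes(self._buf[HEADER.size:end]))
            del self._buf[:end]
        return messages


stream = frame(b"read") + frame(b"write-data")
d = Deframer()
print(d.feed(stream[:3]))   # [] : header not yet complete
print(d.feed(stream[3:]))   # [b'read', b'write-data']
```

Note that this parser only works if it sees every byte from the start of
the stream.  If a segment is missing, it has no way to locate the next
boundary, which is the limitation behind the "accelerated processing needs".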

Although I may upset some with this message, I feel this is a pivotal point
in time.  Yes, it is possible to support this protocol in software, but not
at competitive data rates; that much should be clear.  At this point, with
the iSCSI protocol in flux, it is impossible to make hardware for it.  With
a minor amount of effort, this protocol could be placed into a common
format.  That makes this an architectural decision, and a far-reaching one:
it will determine the number of adapters that systems need to support these
various protocols, it will have an impact over the next decade, and it will
likely influence networking profoundly.  These are not just protocols.  You
are attempting to redefine the nature of the network adapter.

Presently you wish to see this done in a haphazard manner, without any
coordination by the IETF.  This attitude does not reflect the magnitude of
this endeavor.  Frankly, had the IPS followed the efforts of the sigtran
group, considerable time could have been saved, and still can be.  As it is
now, we are re-adopting multi-connection protocols with unique SACK packet
schemes, error handling, etc.  In other words, we are re-inventing SCTP.  It
took the sigtran group more than two years to get SCTP to where it is now,
and it is in a form suitable for generic use.  If the effort is to use TCP,
I can show how that can be done as well by adopting SCTP structures.  iSCSI,
RDMA, iFCP, and FCIP should not each need special devices merely because
they want improved error detection, framing, and data vectoring.  This is
not rocket science, but not using a common format turns it into an adapter
zoo.

Doug


David Black wrote:
> > This requirements document makes it clear there is expectation of
> > modifying Ethernet adapters to support this protocol.  Should this
> > required hardware support be made in a general fashion to allow
> > common use among other protocols?
>
> There are at least two announced iSCSI products and
> an open source driver that do not require any
> modifications to existing Ethernet adapters,
> so such modifications are clearly not a requirement.
>
> > This hardware requirement is primarily based on two
> > requirements, to increase the level of error detection and to allow
> > framing.
>
> Error detection (i.e., CRC or checksum) can be done in
> software as Doug has frequently pointed out on this list.
> I don't think the statement about IPS and TSVWG pursuing
> a common error detection algorithm is correct -- while
> I'll defer to the ADs (who are the chairs of TSVWG), I
> believe TSVWG has significant interest in an improved
> checksum (e.g., Adler-32 based on adding 16-bit quantities
> instead of 8-bit), whereas IPS intends to use CRCs.
>
> Framing is optional and being pursued in a layered fashion
> as called for by the WG charter.  The instructions in
> the WG charter should be sufficient - adding text to the
> iSCSI requirements document that introduces otherwise
> unneeded dependencies on other protocol specification
> efforts (i.e., iFCP, FCIP) is a bad idea.  Heaven help
> us if we have to submit all the protocol drafts for the
> IPS WG to the IESG in one big bundle - at the very least
> the IESG will be annoyed, and annoying the IESG has all
> sorts of bad side effects ;-).
>
> I don't see that any change to the iSCSI requirements draft
> is needed in either of these areas based on this set of
> comments.
>
> Doug also posted a pointer to the wrong draft - I suspect
> he meant to point to draft-otis-tcp-framing-00.txt.
>
> --David
>
> ---------------------------------------------------
> David L. Black, Senior Technologist
> EMC Corporation, 42 South St., Hopkinton, MA  01748
> +1 (508) 435-1000 x75140     FAX: +1 (508) 497-8500
> black_david@emc.com       Mobile: +1 (978) 394-7754
> ---------------------------------------------------
>
>
> > -----Original Message-----
> > From:	Douglas Otis [SMTP:dotis@sanlight.net]
> > Sent:	Monday, April 23, 2001 4:15 PM
> > To:	KRUEGER,MARJORIE (HP-Roseville,ex1); Ips Reflector (E-mail)
> > Cc:	Allison Mankin; David Black; Elizabeth G Rodriguez (Elizabeth);
> > Scott Bradner
> > Subject:	RE: I-D ACTION:draft-ietf-ips-iscsi-reqmts-03.txt
> >
> > Marjorie,
> >
> > This requirements document makes it clear there is expectation of
> > modifying Ethernet adapters to support this protocol.  Should this
> > required hardware support be made in a general fashion to allow
> > common use among other protocols?  This hardware requirement is
> > primarily based on two requirements, to increase the level of error
> > detection and to allow framing.  Presently, IETF supports a framing
> > protocol that also increases the level of error detection.
> >
> > Presently TSVWG and IPS are working on a common error detection
> > algorithm.  In addition, there are two other protocols expecting
> > hardware for framing and error detection.  This is iFCP and FCIP.
> >
> > See:
> > http://www.ietf.org/internet-drafts/draft-ietf-ips-fcencapsulation-00.txt
> >
> > It is possible to have all these protocols use the same error detection
> > and framing.  If this MUST be done using TCP, as this requirement
> > document demands, then here is a possible general purpose header that
> > would allow common use of hardware and an easy transition into SCTP.
> >
> > I will be happy to define a mapping from the present protocols into this
> > generalized form.  The advantage should be obvious.  One Ethernet adapter
> > can handle these various protocols without specialized hardware for each.
> >
> > For those wishing to update and route based on encapsulated headers, a
> > fix-up field at the end of these headers will allow use of a common error
> > scheme using header fix-up.
> >
> > Here is an example of how TCP can be made to look like SCTP.
> > http://www.ietf.org/internet-drafts/draft-otis-iscsi-fullack-00.txt
> >
> > This header could become a TCP option field to allow for negotiation.
> >
> > P.S.
> > One additional question however.
> >
> > On page 18,
> >  "The iSCSI protocol document SHOULD NOT define the management
> >  architecture for iSCSI within the network infrastructure."
> >
> > What does this mean?
> >
> > Doug
> >
> >
> > > The IP Storage Working group is chartered with developing
> > > comprehensive technology to transport block storage data
> > > over IP protocols.  This effort includes a protocol to
> > > transport the Small Computer Systems Interface (SCSI)
> > > protocol over the internet (iSCSI).
> > >
> > > A URL for this Internet-Draft is:
> > > http://www.ietf.org/internet-drafts/draft-ietf-ips-iscsi-reqmts-03.txt
> > >