RE: [Ips] Recent comments about FCoE and iSCSI

"Larry Boucher" <> Fri, 27 April 2007 15:25 UTC

From: "Larry Boucher" <>
To: <>
Subject: RE: [Ips] Recent comments about FCoE and iSCSI
Date: Thu, 26 Apr 2007 08:39:56 -0700
Organization: Alacritech, Inc.
Message-ID: <05f101c78819$26c51a80$>

While it is hard to disagree with the accuracy of your observations, there
are a couple of points where I disagree, and an overall observation that I
would make.  First, you suggest that iSCSI has a problem with dropped
packets in a congested environment.  If you are comparing apples to apples,
the same congestion in an FC or FCoE environment will bring things to a
halt.  It would seem that lower performance is better than no performance.
In the perfect world required by FC/FCoE, iSCSI performs beautifully.  This
would seem to give an advantage to iSCSI.

Your other major complaint involves long latencies caused by network
failure.  If this is a common occurrence in either environment, there are
much worse problems than latency issues.  While your point may be valid, it
would seem that covering five-sigma events is just not a performance issue.

FCoE is most easily related to UDP.  In network environments similar to
those for which FCoE is considered useful, UDP is still much more efficient
than TCP.  However, despite this efficiency, UDP has fallen into disuse
over time.  Part of this is, as you suggest, driven by the increasing
performance of processors, which allowed them to saturate the network
running TCP, but there are issues with this that I will try to address in
the following paragraph.  More importantly, as networking and the Internet
have grown, UDP has found fewer appropriate environments, and managing two
protocols to achieve the same result has become more expensive.  This is
despite the fact that UDP is still much more efficient than TCP in a
noise-free environment.

As far as CPU performance and protocol processing are concerned, I believe
that theirs is a much misunderstood relationship.  The general purpose CPU
is designed to do general purpose functions.  Any system that is not moving
much data should certainly make use of the CPU, however inefficient, to move
the data, as garbage collection is one of the strengths of the CPU.  For any
system that has been designed for the specific purpose of moving data, the
general purpose CPU (irrespective of size or clock rate) is a very
inefficient mechanism.  Since the number of machines purchased for the
specific purpose of moving data (file service, web service, backup,
streaming, etc.) is sufficiently large, it would seem that specialized
hardware designed to do this efficiently would be of value.  While hardware
designed to move data via TCP is, as you point out, not simple, it is much
more cost-effective, and uses much less power, than a general-purpose CPU;
and if you want real complexity, take a look at that CPU.  It seems like
iSCSI has the best of both worlds--simplicity through the use of a NIC and
CPU in low demand environments, and speed and efficiency via TOE in high
demand environments.  And all without having to change protocols.

Larry Boucher

From: Zack Best []
Sent: Wednesday, April 25, 2007 1:38 PM
Subject: RE: [Ips] Recent comments about FCoE and iSCSI

The real debate here is between two types of networks.
The first is reliable at the link level and does not
drop packets under congestion.  The second is running
a reliable transport protocol (i.e. TCP) over an
unreliable link-level network.

I agree with the scaling argument.  For sufficiently
large networks, reliable link level doesn't work well
because network component failure, or chronically
congested links are not handled well.  For
sufficiently small networks, reliable link level has
some significant advantages in simplicity, low
hardware cost, performance, and worst case latency.

My personal view is that the vast majority of
enterprise storage networks fall in the "sufficiently
small" category.  This view has to some extent been
vindicated by the continuing success of Fibre Channel
in this space and the inability of iSCSI to displace
FC in any significant way for enterprise storage.  Of
course, this may or may not change in the future.

Whether FC is simpler than iSCSI depends largely on
your definition of simplicity.  If one defines
simplicity/complexity as the number of gates or lines
of code to reduce the protocol to hardware or
firmware, then my experience is that iSCSI is 2X to 3X
the complexity of FC.  This has implications in cost
and reliability.

Particularly problematic with iSCSI is the
unpredictability of the performance.  Performance is
great with no packet drop.  However, even a small
amount of congestion can cause a sudden large drop in
performance.  This can be difficult to predict, as a
network that is almost but not quite congested can run
great, but a small incremental change of any sort can
cause the performance to become suddenly unacceptable.
For FC, or any other protocol using link-level flow
control, the reduction in performance is much more
graceful and incremental when the level of congestion
is small and intermittent.
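The graceful-degradation point can be illustrated with a toy simulation (the function and all the numbers below are purely hypothetical, not any real FC implementation): with credit-based link-level flow control, the sender stalls instead of dropping frames, so every frame is eventually delivered at whatever rate the receiver can drain.

```python
def run_credited_link(frames, credits, drain_per_tick):
    """Toy credit-based link: the sender transmits only while it holds
    credits; the receiver returns one credit per frame it drains.
    Returns (frames_delivered, frames_dropped)."""
    sent = delivered = 0
    queue = 0  # frames buffered at the receiver
    while delivered < frames:
        # transmit while credits remain and frames are pending
        while credits > 0 and sent < frames:
            credits -= 1
            sent += 1
            queue += 1
        # receiver drains some frames and returns their credits
        drained = min(drain_per_tick, queue)
        queue -= drained
        credits += drained
        delivered += drained
    # the sender never overruns the receiver, so nothing is dropped
    return delivered, 0

print(run_credited_link(frames=1000, credits=8, drain_per_tick=2))  # -> (1000, 0)
```

The throughput is pinned to the receiver's drain rate, but it never collapses: contrast this with a drop-based network, where the same overload would trigger retransmission timeouts.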

A second major problem with iSCSI is the unbounded
nature of worst case latency.  When a storage network
fails, it is desirable to detect the failure in a
fraction of a second and transition to a backup
network.  TCP, when implemented to the standards, can
take many seconds or minutes to determine that a
network has failed and close the connection.  RFC
2988, for instance, requires that the minimum
retransmission timeout be one second.  This means a
single dropped packet may add one second to the
latency of outstanding commands.  This is a huge
amount of time on a 10G link (at 10 Gb/s, roughly
1.25 GB of line time).  No doubt this could be mitigated by
drastically reducing the timeouts within TCP, but the
market seems to be surprisingly resistant to tampering
with accepted standards here.
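The one-second floor cited above falls directly out of the RFC 2988 arithmetic. Below is a sketch (not production TCP code; the clock granularity G is an assumed value) showing that on a LAN with roughly 200-microsecond round trips, the computed RTO is still pinned at the mandated one-second minimum:

```python
class RtoEstimator:
    """Sketch of RFC 2988's retransmission timeout (RTO) computation.
    The constants (alpha=1/8, beta=1/4, K=4, the 1-second floor) come
    from the RFC; the clock granularity G is an assumption."""
    G = 0.1  # assumed clock granularity, in seconds

    def __init__(self):
        self.srtt = None    # smoothed round-trip time
        self.rttvar = None  # round-trip time variation
        self.rto = 1.0      # initial RTO per RFC 2988 section 2.1

    def on_rtt_sample(self, r):
        if self.srtt is None:
            # first measurement (section 2.2)
            self.srtt = r
            self.rttvar = r / 2
        else:
            # subsequent measurements (section 2.3)
            self.rttvar = 0.75 * self.rttvar + 0.25 * abs(self.srtt - r)
            self.srtt = 0.875 * self.srtt + 0.125 * r
        # RTO = SRTT + max(G, 4*RTTVAR), rounded up to 1 second (section 2.4)
        self.rto = max(1.0, self.srtt + max(self.G, 4 * self.rttvar))

est = RtoEstimator()
for sample in (0.0002, 0.00019, 0.00021):  # ~200 microsecond LAN RTTs
    est.on_rtt_sample(sample)
print(est.rto)  # -> 1.0: the mandated floor dominates on a fast LAN
```

However small the measured RTT, the floor in section 2.4 keeps the timer at one second, which is the stall a single drop can add to outstanding commands.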

Overall, the FC and FCP protocols have a lot in common
with the Intel x86 instruction set architecture.  They
are overly complex, and rather poorly designed by
modern standards.  But they are good enough, and there
is a huge amount of value-add that has been built on
top of them, and therefore little incentive to change.
FCoE is an interesting idea because it preserves 90%
of the existing value-add of FC, unifies the physical
link with Ethernet, and uses the reliable link method
of packet delivery.

There are two significant possibilities for iSCSI to
displace FC (or FCoE) in enterprise storage networks.
First is if the networks start to scale to large
enough size that FC can't be made sufficiently
reliable, and second if CPU compute cycles become
sufficiently cheap that the iSCSI protocol can be run
in host software with no negative performance impact.
Barring either of these, it seems that iSCSI will have
an uphill battle, and FCoE may have a place.

 -----Original Message-----
From: Julian Satran []
Sent: Tuesday, April 24, 2007 3:10 PM
Subject: [Ips] Recent comments about FCoE and iSCSI

Dear All,

The trade press is lately full of comments about the
latest and greatest reincarnation of Fibre Channel
over Ethernet.
It made me try and summarize all the long and hot
debates that preceded the advent of iSCSI.
Although FCoE proponents make it look like no debate
preceded iSCSI, that was not so - FCoE was considered
even then and was dropped as a dumb idea.

Here is a summary (as far as I can remember) of the
main arguments.  They are not bad arguments even in
retrospect, and technically FCoE doesn't look better
than it did then.

Feel free to use this material in any form.  I expect
this group to seriously expand my arguments and make
them public - in personal or collective form.

And do not forget - it is a technical dispute -
although we all must have some doubts about the way it
is pursued.



What a piece of nostalgia :-)

Around 1997 when a team at IBM Research (Haifa and
Almaden) started looking at connecting storage to
servers using the "regular network" (the ubiquitous
LAN) we considered many alternatives (another team
even had a look at ATM - still a computer network
candidate at the time).  I won't take you through all
of our rationale (and we went over some of it again at
the end of 1999 with a team from Cisco before we
convened the first IETF BOF in 2000 at Adelaide that
resulted in iSCSI and all the rest), but the reasons
we chose to drop Fibre Channel over raw Ethernet were
multiple:

Fibre Channel Protocol (SCSI over Fibre Channel Link)
is "mildly" effective because:
- it implements endpoints in a dedicated engine
- it has no transport layer (recovery is done at the
application layer under the assumption that the error
rate will be very low)
- the network is limited in physical span and logical
span (number of switches)
- flow-control/congestion control is achieved with a
mechanism adequate for a limited-span network
(credits); the packet loss rate is almost nil, and
that allows FCP to avoid using a transport
(end-to-end)
- the FCP switches are simple (addresses are local and
the memory requirements can be limited through the
credit mechanism)
However:
- FCP endpoints are inherently costlier than simple
NICs - the cost argument (initiators are more
expensive)
- The credit mechanism is highly unstable for large
networks (check switch vendors' planning docs for the
network diameter limits) - the scaling argument
- The assumption of low losses due to errors might
radically change when moving from 1 to 10 Gb/s - the
scaling argument
- Ethernet has no credit mechanism, and any mechanism
with a similar effect increases the endpoint cost
- Building a transport layer in the protocol stack has
always been the preferred choice of the networking
community - the community argument
The "performance penalty" of a complete protocol stack
has always been overstated (and overrated). Advances
in protocol stack implementation and finer tuning of
the congestion control mechanisms make conventional
TCP/IP performing well even at 10 Gb/s and over.
Moreover the multicore processors that become dominant
on the computing scene have enough compute cycles
available to make any "offloading" possible as a mere
code restructuring exercise (see the stack reports
from Intel, IBM etc.)
Building on a complete stack makes available a wealth
of operational and management mechanisms built over
the years by the networking community (routing,
provisioning, security, service location etc.) - the
community argument
Higher level storage access over an IP network is
widely available and having both block and file served
over the same connection with the same support and
management structure is compelling - the community
Highly efficient networks are easy to build over IP
with optimal (shortest path) routing while Layer 2
networks use bridging and are limited by the logical
tree structure that bridges must follow. The effort to
combine routers and bridges (rbridges) is promising to
change that but it will take some time to finalize
(and we don't know exactly how it will operate).
Untill then the scale of Layer 2 network is going to
seriously limited - the scaling argument

As a side argument - a performance comparison made in
1998 showed SCSI over TCP (a predecessor of the later
iSCSI) to perform better than FCP at 1 Gb/s for block
sizes typical of OLTP (4-8 KB).  That was what
convinced us to take the path that led to iSCSI - and
we used plain-vanilla x86 servers with plain-vanilla
NICs and Linux (with similar measurements conducted on

The networking and storage community acknowledged
those arguments and developed iSCSI and the companion
protocols for service discovery, boot etc.

The community also acknowledged the need to support
existing infrastructure and extend it in a reasonable
fashion, and developed two protocols: iFCP (to support
hosts with FCP drivers and IP connections, to connect
to storage by a simple conversion from FCP to TCP
packets) and FCIP, to extend the reach of FCP through
IP (connecting FCP islands through TCP links).  Both
have been implemented and their foundation is solid.

The current attempt at developing a "new-age" FCP over
an Ethernet link goes against most of the arguments
that gave us iSCSI etc.

It ignores networking layering practice, builds an
application protocol directly above a link (and thus
limits scaling), mandates elements at the link layer
and application layer that make applications more
expensive, and leaves aside the whole "ecosystem" that
accompanies TCP/IP (and not Ethernet).

In a related effort (and at one point also when
developing iSCSI) we considered moving away from SCSI
(like some non-standardized, but popular in some
circles, software did - e.g., NBP) but decided against
it.  SCSI is a mature and well-understood access
architecture for block storage and is implemented by
many device vendors.  Moving away from it would not
have been justified at the time.


Ips mailing list
