RE: A Transport Protocol Without ACK

"Y P Cheng" <ycheng@advansys.com> Sat, 16 September 2000 19:31 UTC

From: Y P Cheng <ycheng@advansys.com>
To: "'Ips@Ece. Cmu. Edu'" <ips@ece.cmu.edu>
Subject: RE: A Transport Protocol Without ACK
Date: Sat, 16 Sep 2000 11:30:24 -0700
Message-ID: <000701c0200c$2dfe22a0$fe406620@yp_portable.advansys.com>
In-Reply-To: <0F31E5C394DAD311B60C00E029101A0704100F9B@corpmx9.isus.emc.com>

David Black wrote:
> < From: Y.P. Cheng>
> > The main difference between my view and that of ips discussions is the
> > semantics on creating and interpreting the iSCSI headers.  The working
> > group assumes the iSCSI in using TCP/IP will need several system calls:
> > write command, read/write data, and read status. Hence, iSCSI will have the
> > problem of interlocking.  The working group tries to solve the problem
> > with asymmetric and symmetric models.  However, if the group assumes a
> > single system call to send an iSCSI request, we can let the transport layer
> > break up iSCSI requests into PDUs of command, data, and status.  The
> > asymmetric model is no longer needed.
> I think "system calls" is the wrong word.  The whole point of
> sessions is to take advantage of network fabric parallelism,
> i.e., use two or more NICs on initiator and target.
> This requires the transport layer to figure out which PDU
> to send on which NIC (i.e., which TCP connection) independent of what's
> implemented in hardware or software, and entails logic on the far side to
> put things back in order.  This logic could be in iSCSI or a
> transport like SCTP.
> <snip>
> Because reliable delivery isn't enough.  Congestion control is also
> required.  TCP does congestion control in a fashion that no existing
> SCSI transport or mechanism does.  It's not the only solution, but it
> or an equivalent congestion control solution is REQUIRED.  Please read
>
> http://www.ietf.org/internet-drafts/draft-floyd-cong-04.txt
>

Thank you for pointing out that, in addition to mapping a SCSI request to
iSCSI PDUs, the working group is also addressing the delivery of PDUs to
prevent deadlock when TCP/IP is used.  It is also assumed that TCP/IP
provides congestion and flow control.  Whether I like it or not, the
reality is that many iSCSI implementations will use TCP/IP.  However, I do
have concerns if iSCSI requires the ACK to achieve reliable delivery.  Let
me use the following example to illustrate my concern.

Assume we wish to perform a backup to a device 3000 miles away using the
iSCSI protocol.  Suppose it takes 10 milliseconds for an IP packet to travel
from the source to the destination, and another 10 milliseconds for the ACK
to come back.  Let's also assume that the backbone is capable of 1 gigabit
per second throughput.  To keep data streaming on this connection, the
source needs to send roughly 2 megabytes of data before seeing the first ACK
come back.  Similarly, the target must be prepared to buffer roughly 2
megabytes of data.  This example becomes much more interesting when we
increase the backbone connection speed to that of OC-192 at 10 gigabits per
second, or if the backup devices are accepting incoming streams from
multiple initiators.  Either case needs a lot of memory.
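
As a rough check of the figures in the example above, the data that must be
in flight before the first ACK returns is the link rate multiplied by the
round-trip time.  The sketch below is only an illustration: the function
name bdp_bytes is hypothetical, and it assumes 8 bits per byte and ignores
framing and header overhead.  Under those assumptions the result is of the
same order as the 2 megabytes quoted above.

```python
def bdp_bytes(link_bits_per_sec: float, one_way_delay_sec: float) -> float:
    """Bytes that must be in flight before the first ACK can return.

    Assumes 8 bits per byte and no framing overhead (illustrative only).
    """
    round_trip_sec = 2 * one_way_delay_sec
    return link_bits_per_sec * round_trip_sec / 8


ONE_WAY_DELAY = 0.010  # 10 ms each way, as assumed in the example

for name, rate in [("1 Gb/s", 1e9), ("OC-192 (about 10 Gb/s)", 10e9)]:
    megabytes = bdp_bytes(rate, ONE_WAY_DELAY) / 1e6
    print(f"{name}: about {megabytes:.1f} MB per connection")
```

With several initiators streaming to the same backup device, the target's
buffer requirement would scale with the number of such connections.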

We could take a lesson from Fibre Channel FCP using class-3 datagrams.
A SCSI request is inherently acknowledged by the return of the status PDU.
If an ACK to datagrams is not necessary, then the 2MB buffer requirement in
the above example can be reduced to less than 100KB, enough to accommodate
the speed variation between the host system bus and the connecting media.
By the way, multiple paths do not solve this problem.  The problem of PDUs
lost to traffic jams or a congested system bus is also still there.
However, the congestion and retransmission problem has already been solved
in the implementation of a Fibre Channel adapter.  TCP/IP is not the only
protocol providing a solution.

I guess my point is that iSCSI should allow a transport protocol that
does not require ACK.  (My apologies for mentioning UDP in my previous
postings.)  If iSCSI does not limit itself to TCP/IP, then the asymmetric
model is also a non-issue for people who design a smart NIC adapter.