[Tsvwg] "DO IT RIGHT", or "DON'T DO IT AT ALL"

"Williams, Jim" <Jim.Williams@Emulex.com> Thu, 05 December 2002 19:08 UTC

Message-ID: <3356669BBE90C448AD4645C843E2BF289B8F59@xbl.ma.emulex.com>
From: "Williams, Jim" <Jim.Williams@Emulex.com>
To: "'tsvwg@ietf.org'" <tsvwg@ietf.org>
Cc: "'rddp@ietf.org'" <rddp@ietf.org>
Date: Thu, 05 Dec 2002 14:05:23 -0500

The subject under discussion here is transforming
TCP into a record-aligned transport to allow receivers
to do some level of ULP processing prior to reassembly
of the TCP stream.  The specific goal is to enable
an RDDP protocol to directly place incoming
data even if TCP segments arrive out of order.

At least two drafts have been submitted which
propose mechanisms:

     draft-ietf-tsvwg-tcp-ulp-frame-01.txt [tuf] (expired)
and
     draft-culley-iwarp-mpa-00.txt [mpa]

In Atlanta I argued the case for "DON'T DO IT AT ALL".
The case is that TCP is a stream protocol and should
not be used otherwise, that SCTP is a record protocol
and should be used when a record protocol is needed,
and that "fixing" TCP rather than switching to SCTP
is the wrong architectural direction.

However, in the aftermath of Atlanta I fear that
my argument is losing within the RDDP group, so I would
like to at least explore the alternative.

Under the heading of "DO IT RIGHT", both drafts
fall a bit short for the same reason: they both
went to great lengths to minimize any changes
to TCP.  So I would like to explore the possibility that
changing TCP "just a little bit" may not be the
right approach to solving the problem, and that TCP
should be changed to the extent necessary to solve the
problem the right way.  Of course, this needs
to be done in a way that does not break interoperability.

The guiding architectural principle for "doing it
right" is that the receiver should have some level of
guarantee that it will always get record-aligned TCP
segments, and that any unaligned segments can and should
be considered an error case, resulting in a closed
connection.
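
As a rough illustration of that principle, the receiver-side check
might look like the sketch below.  It is not taken from either
draft.  The framing (a 2-byte big-endian length in front of each
record) and the names split_aligned_segment, frame and
UnalignedSegment are assumptions made up for this example.

```python
import struct

# Assumption: each record is preceded by a 2-byte big-endian length.
HEADER = struct.Struct("!H")


class UnalignedSegment(Exception):
    """Raised when a segment does not hold a whole number of records."""


def frame(record):
    """Prefix one record with its hypothetical length header."""
    return HEADER.pack(len(record)) + record


def split_aligned_segment(payload):
    """Return the records carried in one TCP segment payload.

    The payload must start on a record boundary and end on one.
    Anything else raises UnalignedSegment, which the caller would
    treat as an error and use to close the connection.
    """
    records = []
    offset = 0
    while offset < len(payload):
        if offset + HEADER.size > len(payload):
            raise UnalignedSegment("truncated record header")
        (length,) = HEADER.unpack_from(payload, offset)
        offset += HEADER.size
        if offset + length > len(payload):
            raise UnalignedSegment("record runs past end of segment")
        records.append(payload[offset:offset + length])
        offset += length
    return records
```

Because every accepted segment is self-describing, the receiver
can process it as soon as it arrives, with no second code path
for unaligned data.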

Both [tuf] and [mpa] state
that the receiver might or might not get record-aligned
segments, and they both add another layer of protocol
processing to allow the receiver to determine whether
the alignment attempt was or was not successful.  They
both require the receiver to support the dual
complexity of handling both aligned and unaligned
segments on a given connection.

To me, this seems to have the type of undesirable
complexity often associated with ill-advised, ad hoc
solutions.  I would encourage that some thought
be given to a more cleanly architected solution
even if the downside is greater impact on TCP
behavior.

There are two fundamental problems with guaranteeing
record alignment to the receiver.

  1.  How can the sender always align?

  2.  How can middle-boxes be prevented from
      re-segmenting the TCP stream?

With respect to 1., both drafts state that alignment
will usually be done, but list a number of exception
cases under which the sender will not attempt to 
align.  

I would propose as part of "doing it right" that
a sender which agrees to align must ALWAYS align
records with TCP segments.  I will not attempt
to list here all the implications of this, but
the summary is that this will require
small but substantive changes to TCP
sender behavior, and that these changes, if done
right, will not break interoperability with
existing implementations.  (However, this is
a tricky area, and until the work is done,
it is not possible to be sure.)
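
The core of the "ALWAYS align" rule can be sketched as follows.
This covers only the simple case of packing new data.  It ignores
retransmission and window effects, which are part of the tricky
area mentioned above.  The 2-byte length header and the name
pack_aligned are assumptions for illustration only.

```python
import struct

# Assumption: each record is preceded by a 2-byte big-endian length.
HEADER = struct.Struct("!H")


def pack_aligned(records, mss):
    """Group framed records into segment payloads of at most mss bytes.

    Every payload starts on a record boundary and holds only whole
    records.  A record that cannot fit in a single segment is
    rejected rather than split.
    """
    segments = []
    current = b""
    for record in records:
        framed = HEADER.pack(len(record)) + record
        if len(framed) > mss:
            raise ValueError("record does not fit in one segment")
        if len(current) + len(framed) > mss:
            # Close the current segment instead of splitting the record.
            segments.append(current)
            current = b""
        current += framed
    if current:
        segments.append(current)
    return segments
```

Note the cost of the rule: a segment may go out shorter than the
MSS, and the upper layer must keep its records within the MSS.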

With respect to 2., I would propose that senders
that agree to align should announce this intention
by adding a newly defined RECORD_ALIGNED TCP option
to the SYN segments used to set up the connection.

This is a reasonable architectural solution in
that middle-boxes that want or need to re-segment
the TCP stream should not include this RECORD_ALIGNED
option in their outgoing SYN segments.
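
A sketch of how such an option could be carried and checked is
given below.  No option kind has been assigned for RECORD_ALIGNED.
The value 253 is one of the kinds reserved for experiments and is
used here purely as a placeholder.  The function names are also
assumptions.  The kind/length/data layout and the single-byte
End of Option List (0) and No-Operation (1) options are standard
TCP.

```python
# Placeholder only: 253 is an experimental kind, not an assigned value.
RECORD_ALIGNED_KIND = 253


def encode_option(kind, data=b""):
    """Encode one TCP option as kind, length, data."""
    return bytes([kind, 2 + len(data)]) + data


def parse_options(raw):
    """Return [(kind, data), ...] from the options area of a TCP header."""
    options = []
    i = 0
    while i < len(raw):
        kind = raw[i]
        if kind == 0:        # End of Option List
            break
        if kind == 1:        # No-Operation, a single byte
            i += 1
            continue
        if i + 1 >= len(raw):
            raise ValueError("truncated option")
        length = raw[i + 1]
        if length < 2 or i + length > len(raw):
            raise ValueError("bad option length")
        options.append((kind, raw[i + 2:i + length]))
        i += length
    return options


def offers_record_alignment(raw):
    """True if the SYN options carry the RECORD_ALIGNED option."""
    return any(kind == RECORD_ALIGNED_KIND for kind, _ in parse_options(raw))


def strip_record_aligned(raw):
    """Re-emit options without RECORD_ALIGNED, as a well-behaved
    re-segmenting middle-box would on its outgoing SYN."""
    kept = b"".join(
        encode_option(kind, data)
        for kind, data in parse_options(raw)
        if kind != RECORD_ALIGNED_KIND
    )
    # Pad with No-Operation bytes to a 4-byte boundary.
    return kept + b"\x01" * (-len(kept) % 4)
```

A receiver would enable the strict alignment check only when
offers_record_alignment() is true for the peer's SYN.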

It is admittedly only a partial solution in a 
practical sense.  Undoubtedly, some fraction
(hopefully small) of existing re-segmenting
middle-boxes will transparently pass through
unrecognized TCP options.

However, in practice this will not be a big enough
problem that it should drive protocol design in a
major way.  Middle-boxes that re-segment and also
send a TCP option agreeing not to re-segment
are engaging in aberrant and erroneous behavior,
and will cause connections to quickly close in error.
The result is either they will be fixed, or
the affected receivers will be configured to
not assume alignment (ignore incoming RECORD_ALIGNED
options).

It is hoped that CRCs used for data integrity
will be sufficient to also protect against aberrant
middle-box behavior.  One can certainly construct
hypothetical scenarios that defeat this, but
these are unlikely to be a problem in practice.
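
To make the CRC point concrete, here is a sketch of a receiver
that assumes every segment begins on a record boundary and relies
on a per-record CRC to catch the cases where that assumption is
false.  The framing (2-byte length, payload, 4-byte CRC) and the
names frame and segment_is_aligned are assumptions.  zlib.crc32
is the ordinary CRC-32 and stands in for whatever CRC the upper
layer actually uses.

```python
import struct
import zlib

# Assumption: record = 2-byte length, payload, 4-byte CRC over both.
HEADER = struct.Struct("!H")
TRAILER = struct.Struct("!I")


def frame(record):
    """Build one record with a length header and a CRC trailer."""
    body = HEADER.pack(len(record)) + record
    return body + TRAILER.pack(zlib.crc32(body))


def segment_is_aligned(payload):
    """True if the payload parses as whole records with valid CRCs.

    A segment cut or shifted by a middle-box fails either the
    length check or, if the bogus length happens to be plausible,
    the CRC check.
    """
    offset = 0
    while offset < len(payload):
        if offset + HEADER.size > len(payload):
            return False
        (length,) = HEADER.unpack_from(payload, offset)
        end = offset + HEADER.size + length
        if end + TRAILER.size > len(payload):
            return False
        (crc,) = TRAILER.unpack_from(payload, end)
        if zlib.crc32(payload[offset:end]) != crc:
            return False
        offset = end + TRAILER.size
    return True
```

The hypothetical scenarios that defeat this are those where
misaligned bytes happen to parse as a record with a matching
CRC, which for a 32-bit CRC is a chance on the order of one in
2^32 per bogus record.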


