[rddp] FW: Analysis of overhead for software implementation of markers verses key based framing

"Jim Pinkerton" <jpink@windows.microsoft.com> Thu, 05 December 2002 20:54 UTC

Date: Thu, 05 Dec 2002 12:55:09 -0800
Message-ID: <E6564B8F86852D46A4E98C485FB33B8F054915@WIN-MSG-10.wingroup.windeploy.ntdev.microsoft.com>
Thread-Topic: Analysis of overhead for software implementation of markers verses key based framing
Thread-Index: AcHiO5oM04HLVKMMT4CIl5G4iJb1Si6YU9kQ
From: Jim Pinkerton <jpink@windows.microsoft.com>
To: rddp@ietf.org
X-OriginalArrivalTime: 05 Dec 2002 20:55:06.0867 (UTC) FILETIME=[9767FC30:01C29CA0]
Subject: [rddp] FW: Analysis of overhead for software implementation of markers verses key based framing

Below is an analysis that I mailed out a while back to the group that
was looking at the framing issue. They were trying to choose between
the "magic key" based approach (see draft-ietf-tsvwg-tcp-ulp-frame-01
for the details) and using markers.

 

The short summary was that the additional overhead of a marker-based
approach paled in comparison to the CRC overhead, and that markers got
us out of the probabilistic world we'd been in with the "magic key"
based approach.

 

As one of the authors of draft-ietf-tsvwg-tcp-ulp-frame-01, I can say
we spent about 1.5 years trying to make sure that the probabilistic
approach, which was much simpler than markers, would have a probability
of data corruption so close to zero that it was not significant. The
discussion occurred on the tsvwg reflector, among others, and culminated
in the draft-ietf-tsvwg-tcp-ulp-frame-01 ID. New loopholes for silent
data corruption kept popping up, even 1.5 years into it.

 

The nice thing about a deterministic approach is exactly that - there
are no loopholes.

 

 

Jim

 

 

 

 

-------------------------------------------------------------------------------

 

After yesterday's internal review, the conclusion was that a key-based
mechanism which includes a hash function based on [srcip, dstip,
srcport, dstport, TCPSeqNum, Len] is preferable to using a marker-based
approach. However, this assumes that the application does not somehow
discover the "secret" - the initial sequence number, or ISN. If it does,
the probability of silent data corruption is possibly much higher than
the chance of silent data corruption with a CRC. Badness.
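
As a rough illustration of the key computation described above, here is
a minimal Python sketch. The message does not name the hash function or
the key width, so SHA-256 truncated to 4 bytes is purely an assumption,
as are the function name frame_key, the field packing order, and the
example addresses, ports, and ISN.

```python
import hashlib
import ipaddress
import struct


def frame_key(src_ip, dst_ip, src_port, dst_port, tcp_seq, length):
    """Derive a framing key from the connection 4-tuple, the TCP
    sequence number of the frame, and the frame length.

    Assumption: SHA-256 truncated to 4 bytes stands in for whatever
    hash the real mechanism would use.
    """
    packed = (
        ipaddress.ip_address(src_ip).packed
        + ipaddress.ip_address(dst_ip).packed
        # Network byte order: two 16-bit ports, 32-bit seq, 16-bit length.
        + struct.pack("!HHIH", src_port, dst_port, tcp_seq & 0xFFFFFFFF, length)
    )
    return hashlib.sha256(packed).digest()[:4]


# Hypothetical example values. The application knows its byte offset in
# the stream, but not the ISN, so it cannot predict tcp_seq on its own.
isn = 0x1234ABCD
stream_offset = 0
key = frame_key("192.0.2.1", "192.0.2.2", 49152, 3260,
                (isn + stream_offset) & 0xFFFFFFFF, 1024)
```

The point of the sketch is the dependency on the ISN: every other input
is visible to the application, so once the ISN leaks the application can
compute the same key and place it in its own data.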

 

So a key question is: do we feel that this exposure is acceptable or
unacceptable?

 

A second issue is the delta cost of going from a simple key-based
algorithm to a marker-based algorithm. I've attempted to analyze the
software overhead in an OS-independent way.

 

I am assuming a CRC-based approach for either algorithm, so on transmit
and receive we have to make a sweep through the data. So we go from
zero-touch on transmit to one touch (for most OSes), and on receive into
the file system cache we go from zero touch to one touch (or, to the SO
layer, from 1 to 2 touches). Either approach also assumes preservation
of record boundaries in the TCP stack (initial transmit and
retransmissions).

 

For the analysis of the delta between them, let's assume a 1500 byte MTU
and a 512 byte marker interval (if markers are used), and that the
current stack is optimized well enough to transmit a TCP segment with
only two mbufs (payload mbuf plus TCP/IP/MAC header mbuf) in the common
path. (On some OSes an mbuf is called an MDL; on others, a kbuf. Some
OSes never actually support zero-copy, so for those OSes this analysis
overstates the performance delta they will see.)

 

The additional overhead (beyond CRC calc) for the key-based approach is
marginal: calculate the current sequence number, hash a result, add it
to the end of the existing mbuf used for TCP/IP/MAC headers, and add an
mbuf at the end for the CRC (thus 3 total mbufs). On receive, after
checking the CRC, we simply trim the mbuf list to remove the header and
CRC trailer. Trimming is fairly lightweight; allocs of mbufs can be
heavyweight.
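
The key-based transmit and receive steps above can be sketched roughly
as follows, with a Python list standing in for an mbuf chain. Everything
here is an assumption made for illustration: the names build_chain and
receive_chain, the 4-byte key, the CRC covering key plus payload, and
zlib's CRC-32 standing in for whichever CRC would actually be chosen.

```python
import struct
import zlib


def build_chain(headers: bytes, key: bytes, payload: bytes) -> list:
    """Transmit side: append the key to the header buffer and add one
    trailing buffer for the CRC, giving a 3-element chain."""
    # Running CRC over key, then payload (same as CRC of key + payload).
    crc = zlib.crc32(payload, zlib.crc32(key))
    return [headers + key, payload, struct.pack("!I", crc)]


def receive_chain(chain: list, header_len: int) -> bytes:
    """Receive side: verify the CRC, then trim the key and CRC trailer
    by simply returning the payload buffer."""
    first, payload, trailer = chain
    key = first[header_len:]
    (expected,) = struct.unpack("!I", trailer)
    if zlib.crc32(payload, zlib.crc32(key)) != expected:
        raise ValueError("CRC mismatch")
    return payload


# Hypothetical 54-byte MAC/IP/TCP header and a 4-byte key.
chain = build_chain(b"H" * 54, b"\x01\x02\x03\x04", b"hello")
data = receive_chain(chain, header_len=54)
```

The payload buffer is never copied or split in this sketch, which is
why the extra cost is limited to one more buffer for the CRC.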

 

For the marker-based approach, on transmit we have to add 3 markers,
plus the CRC. Thus if we assume we don't copy the data to a new buffer,
a single payload mbuf is now sliced into 3 mbufs, and 3 marker mbufs are
added. Thus we go from 3 mbufs for key-based to 8 mbufs for marker-based
(1 header mbuf, 3 payload slices, 3 markers, and 1 CRC mbuf). Another
approach would be to always copy the data to a new buffer, and include
as part of the copy the insertion of the markers and the CRC
calculation. The actual bcopy could be rewritten to include the CRC
calculation or, after the data is assembled in the new buffer, the CRC
calculation could be run on it (keep in mind it will be hot in the
cache, so this will be substantially quicker than one might think).
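
To make the buffer arithmetic and the copy-based alternative concrete,
here is a small Python sketch. It assumes a 1460-byte payload (1500 byte
MTU minus 20-byte IP and TCP headers with no options), a marker placed
before every payload byte whose stream offset is a multiple of the
interval, a hypothetical 4-byte all-zero marker, a CRC that covers the
markers, and zlib's CRC-32 as a stand-in. None of those details come
from the analysis itself.

```python
import zlib

MARKER = b"\x00\x00\x00\x00"  # hypothetical 4-byte marker content


def marker_offsets(stream_offset, payload_len, interval=512):
    """Offsets within this segment's payload where a marker is inserted."""
    first = (-stream_offset) % interval
    return list(range(first, payload_len, interval))


def mbuf_counts(stream_offset, payload_len, interval=512):
    """Return (key_based, marker_based) mbuf counts for one segment."""
    offsets = marker_offsets(stream_offset, payload_len, interval)
    # A marker at offset 0 does not split the payload, so it adds no slice.
    payload_slices = len([o for o in offsets if o > 0]) + 1
    key_based = 1 + 1 + 1  # header (with key) + payload + CRC
    marker_based = 1 + payload_slices + len(offsets) + 1
    return key_based, marker_based


def copy_with_markers(payload, stream_offset=0, interval=512):
    """Copy-based alternative: build one new buffer, inserting markers
    and updating the CRC as each piece is copied."""
    out = bytearray()
    crc = 0
    prev = 0
    for off in marker_offsets(stream_offset, len(payload), interval):
        for piece in (payload[prev:off], MARKER):
            out += piece
            crc = zlib.crc32(piece, crc)
        prev = off
    tail = payload[prev:]
    out += tail
    crc = zlib.crc32(tail, crc)
    return bytes(out), crc


counts = mbuf_counts(0, 1460)
framed, crc = copy_with_markers(bytes(1460))
```

With a segment that starts on a marker boundary, mbuf_counts(0, 1460)
gives 3 buffers for key-based and 8 for marker-based, matching the
figures above. Under this sketch's placement rule, a segment that does
not start on a marker boundary can need one more payload slice.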

 

So I'm tempted to believe that the marker approach is not substantially
worse than the key-based approach, given that we're signed up to do a
CRC of the data. Thus, given that with markers you *never* will falsely
guess whether you are framed, it is worth the risk reduction in terms of
creating a protocol that will last many years (part of my concern is
that every time we run a probability analysis, we keep finding new
cases that make the probability assessment substantially worse than
originally predicted).

 

Comments?

 

 

Jim