[rddp] FW: Analysis of overhead for software implementation of markers verses key based framing
"Jim Pinkerton" <jpink@windows.microsoft.com> Thu, 05 December 2002 20:54 UTC
Date: Thu, 05 Dec 2002 12:55:09 -0800
From: Jim Pinkerton <jpink@windows.microsoft.com>
To: rddp@ietf.org
Subject: [rddp] FW: Analysis of overhead for software implementation of markers verses key based framing

Below is an analysis that I mailed out a while back to the group that was looking at the framing issue. They were trying to choose between the "magic key" based approach (see draft-ietf-tsvwg-tcp-ulp-frame-01 for the details) and markers. The short summary was that the additional overhead of a marker based approach paled in comparison to the CRC overhead, and that markers got us out of the probabilistic world we had been in with the "magic key" based approach.

As one of the authors of draft-ietf-tsvwg-tcp-ulp-frame-01, I can say we spent about 1.5 years trying to make sure that the probabilistic approach, which was much simpler than markers, would have a probability of data corruption so close to zero that it was not significant. The discussion occurred on the tsvwg reflector, among others, and culminated in the draft-ietf-tsvwg-tcp-ulp-frame-01 ID. New loopholes for silent data corruption kept popping up, even 1.5 years into it. The nice thing about a deterministic approach is exactly that: there are no loopholes.

Jim

-------------------------------------------------------------------------------

After yesterday's internal review, the conclusion was that a key based mechanism which includes a hash function based on [srcip, dstip, srcport, dstport, TCPSeqNum, Len] is preferable to using a marker based approach. However, this assumes that the application does not somehow discover the "secret", that is, the initial sequence number (ISN). If it does, the probability of silent data corruption is possibly much higher than the chance of silent data corruption with a CRC. Badness. So a key question is: do we feel that this exposure is acceptable or unacceptable?

A second issue is the delta cost of going from a simple key based algorithm to a marker based algorithm. I've attempted to analyze the software overhead in an OS independent way. I am assuming a CRC based approach for either algorithm, so on both transmit and receive we have to make a sweep through the data.
So we go from zero-touch on transmit to one touch (for most OSes), and on receive into the file system cache we go from zero touches to one (or, to the SO layer, from one touch to two). Either approach also assumes preservation of record boundaries in the TCP stack (on initial transmit and on retransmissions).

For the analysis of the delta between them, let's assume a 1500 byte MTU and a 512 byte marker interval (if markers are used), and that the current stack is optimized well enough to transmit a TCP segment with only two mbufs (a payload mbuf plus a TCP/IP/MAC header mbuf) in the common path. (On some OSes an mbuf is called an MDL; on others it is a kbuf. Some OSes never support zero-copy at all, so this analysis shows a worse performance delta than those OSes will actually see.)

The additional overhead (beyond the CRC calculation) for the key based approach is marginal: calculate the current sequence number, hash a result, add it to the end of the existing mbuf used for the TCP/IP/MAC headers, and add an mbuf at the end for the CRC (thus 3 mbufs in total). On receive, after checking the CRC, we simply trim the mbuf list to remove the header and the CRC trailer. Trimming is fairly lightweight; allocating mbufs can be heavyweight.

For the marker based approach, on transmit we have to add 3 markers, plus the CRC. If we assume we don't copy the data to a new buffer, a single payload mbuf is now sliced into 3 mbufs, and 3 marker mbufs are added. Thus we go from 3 mbufs for key based to 8 mbufs for marker based (1 header mbuf, 3 payload slices, 3 markers, and 1 CRC).

Another approach would be to always copy the data to a new buffer, and to insert the markers and calculate the CRC as part of the copy. The actual bcopy could be rewritten to include the CRC calculation or, after the data is assembled in the new buffer, the CRC calculation could be run on it (keep in mind the data will be hot in the cache, so this will be substantially quicker than one might think).
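
The mbuf counts and the copy path above can be checked with a small model. This is only a sketch under stated assumptions, not the framing from any draft: the payload length of 1460 bytes (1500 byte MTU less 40 bytes of IPv4 and TCP headers), the 4-byte marker contents, the placement of one marker in front of each 512 byte slice, the CRC covering the markers, and the names `key_chain`, `marker_chain` and `marker_copy` are all hypothetical. `zlib.crc32` stands in for whichever CRC the protocol would use.

```python
import zlib

MARKER_INTERVAL = 512    # bytes, from the analysis
PAYLOAD_LEN = 1460       # assumption: 1500 byte MTU less 40 bytes of IPv4 + TCP headers
MARKER = b"MRKR"         # hypothetical 4-byte marker; real contents are not given here
HEADERS = b"\x00" * 54   # placeholder for the TCP/IP/MAC header mbuf


def slices(payload):
    """Split the payload at every marker interval."""
    return [payload[i:i + MARKER_INTERVAL]
            for i in range(0, len(payload), MARKER_INTERVAL)]


def key_chain(payload, key):
    """Key based framing: headers with the key appended, payload, CRC trailer."""
    crc = zlib.crc32(payload).to_bytes(4, "big")
    return [HEADERS + key, payload, crc]


def marker_chain(payload):
    """Marker based framing without copying: one marker mbuf per payload slice."""
    chain = [HEADERS]
    crc = 0
    for piece in slices(payload):
        chain += [MARKER, piece]
        # Assumption: the CRC covers markers as well as data.
        crc = zlib.crc32(piece, zlib.crc32(MARKER, crc))
    chain.append(crc.to_bytes(4, "big"))
    return chain


def marker_copy(payload):
    """Copy path: insert markers and fold the CRC into the same pass."""
    out = bytearray()
    crc = 0
    for piece in slices(payload):
        for part in (MARKER, piece):
            out += part
            crc = zlib.crc32(part, crc)  # running CRC over what was just copied
    return bytes(out), crc


payload = (bytes(range(256)) * 6)[:PAYLOAD_LEN]
print(len(key_chain(payload, b"KEY!")), len(marker_chain(payload)))  # prints "3 8"
```

In this model each list element stands for one mbuf, so the list lengths reproduce the 3 versus 8 comparison. `marker_copy` touches the data once and returns a single contiguous buffer, which is the trade the copy approach makes: one extra copy in exchange for a short mbuf chain.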

So I'm tempted to believe that the marker approach is not substantially worse than the key based approach, given that we're signed up to do a CRC of the data. And given that with markers you will *never* falsely guess whether you are framed, I think markers are worth it for the risk reduction, in terms of creating a protocol that will last many years. (Part of my concern is that every time we run a probability analysis, we seem to find new cases that make the probability assessment substantially worse than originally predicted.)

Comments?

Jim