Re: [tcpm] draft-eggert-tcpm-historicize-00

<L.Wood@surrey.ac.uk> Sat, 26 June 2010 18:31 UTC

From: L.Wood@surrey.ac.uk
To: touch@isi.edu
Date: Sat, 26 Jun 2010 19:31:10 +0100
Cc: tcpm@ietf.org, ananth@cisco.com, Anumita.Biswas@netapp.com, L.Wood@surrey.ac.uk

Joe,

can you at least read the cited papers before making claims about them?

From the abstract of Stone's SIGCOMM paper:

Traces of Internet packets from the past two years show that between 1 packet in 1,100 and 1 packet in 32,000 fails the TCP checksum, even on links where link-level CRCs should catch all but 1 in 4 billion errors. 
[..]
Much of our time has been spent negotiating access to data.

s/*can*/*was observed to, over two years in the wild*/
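
For a rough sense of scale, here is a back-of-envelope sketch in Python. The failure rates are the two figures from the abstract quoted above. The miss probability of 1 in 2^16 is purely an assumption (errors treated as uniformly random against a 16-bit check) and is not a figure taken from the paper; the function name is made up for illustration.

```python
# Observed range from the quoted abstract: between 1 packet in 1,100
# and 1 packet in 32,000 fails the TCP checksum.
FAIL_RATES = {"worst trace": 1 / 1_100, "best trace": 1 / 32_000}

# ASSUMPTION (not from the paper): a corrupted packet slips past a
# 16-bit checksum with probability 1 in 2**16.
MISS_PROB = 1 / 2**16


def undetected_one_in(fail_rate: float, miss_prob: float = MISS_PROB) -> float:
    """Return N such that about 1 packet in N is corrupted yet passes."""
    # fail_rate counts only the corrupted packets the checksum caught,
    # so the total corrupted fraction is fail_rate / (1 - miss_prob).
    corrupted = fail_rate / (1 - miss_prob)
    undetected = corrupted * miss_prob
    return 1 / undetected


for label, rate in FAIL_RATES.items():
    print(f"{label}: roughly 1 packet in {undetected_one_in(rate):,.0f} "
          "corrupted but accepted")
```

Under that assumption the answer lands somewhere between tens of millions and a few billion packets per silently accepted error, which a busy high-speed link can reach quickly.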

TCP is supposed to provide a *reliable* transfer to application writers. With jumbo frames and the one's complement checksum, that is no longer true... and application writers don't necessarily have control over frame sizes.

Needs fixing imo.
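
To make the weakness concrete, here is a minimal Python sketch of the 16-bit one's complement sum (in the style of RFC 1071) next to CRC-32 from the standard zlib module. The payload bytes and the two corruption patterns are invented for illustration; they are not taken from the traces in the cited papers.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit one's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


original = bytes.fromhex("1234abcd00ff5678")

# Corruption 1: two 16-bit words trade places. Addition is commutative,
# so the sum cannot notice.
swapped = bytes.fromhex("abcd123400ff5678")

# Corruption 2: one word gains 1 while another loses 1, so the changes
# cancel in the sum.
offset = bytes.fromhex("1235abcc00ff5678")

for label, corrupted in (("swapped", swapped), ("offset", offset)):
    sum_same = internet_checksum(corrupted) == internet_checksum(original)
    crc_same = zlib.crc32(corrupted) == zlib.crc32(original)
    print(f"{label}: checksum unchanged={sum_same}, CRC-32 unchanged={crc_same}")
```

Both corrupted payloads keep the same one's complement checksum as the original, while CRC-32 changes in both cases. The more 16-bit words a segment carries, the more room there is for cancelling changes of this kind.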

On 26 Jun 2010, at 19:09, Joe Touch wrote:

> L.Wood@surrey.ac.uk wrote:
>> Ethernet uses CRC-32, TCP's existing 16-bit checksum
>> isn't a CRC. So I'm a bit puzzled by the question.
>> 
>> The link CRC and TCP checksum can disagree. See
> 
> *can*. The paper below did mathematical analyses, but didn't observe them either
> in a lab or in the wild. I.e., that work didn't see places where it *did*. Thus
> my question.
> 
> I'm not asking whether this is important to address, just what the current
> motivation is. The work below suggests using application-layer checksums. That
> seems prudent for large transfers, but I don't see a motivation for needing this
> fixed at the TCP layer...
> 
> Joe
> 
>> Stone, J., Greenwald, M., Hughes, J., and C. Partridge,
>> "Performance of checksums and CRCs over real data", IEEE/ACM
>> Transactions on Networking, vol. 6, issue 5, pp. 529-543,
>> October 1998.
>> 
>> Stone, J. and C. Partridge, "When the CRC and TCP Checksum
>> Disagree", Proceedings of ACM SIGCOMM, September 2000.
>> 
>> 
>> On 25 Jun 2010, at 19:36, Joe Touch wrote:
>>>> Our experience has shown that TCP often runs over links with much
>>>> higher error rates than intended for the technology (i.e., a design goal of
>>>> BER <= 1E-12 over Ethernet often has an actual BER >> 1E-12). If such
>>>> error-prone links are running at a high speed and with large segment
>>>> sizes (jumbo / giant frames), there exists the real problem that within
>>>> a short timeframe (weeks to months), TCP will deliver corrupted data, as
>>>> CRC-16 was not able to catch the error.
>>> 
>>> Just to confirm, is this a measured problem or a mathematical one? I.e., have
>>> you seen this actually occur (i.e., TCP with invalid CRC-16 but valid checksum)
>>> in the wild, in a lab, or have you never seen it but consider it important anyway?
>> 
> 

Lloyd Wood
L.Wood@surrey.ac.uk
http://sat-net.com/L.Wood