Re: [tcpm] draft-eggert-tcpm-historicize-00

Joe Touch <touch@isi.edu> Sat, 26 June 2010 18:10 UTC

Date: Sat, 26 Jun 2010 11:09:19 -0700
From: Joe Touch <touch@isi.edu>
To: L.Wood@surrey.ac.uk
Cc: tcpm@ietf.org, ananth@cisco.com, Anumita.Biswas@netapp.com


L.Wood@surrey.ac.uk wrote:
> Ethernet uses CRC-32, TCP's existing 16-bit checksum
> isn't a CRC. So I'm a bit puzzled by the question.
> 
> The link CRC and TCP checksum can disagree. See

*can*. The paper below did mathematical analyses, but didn't observe such
disagreements either in a lab or in the wild. I.e., that work didn't see places
where they *did* disagree. Thus my question.

I'm not asking whether this is important to address, just what the current
motivation is. The work below suggests using application-layer checksums. That
seems prudent for large transfers, but I don't see a motivation for needing this
fixed at the TCP layer...
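
To illustrate the kind of disagreement under discussion, here is a minimal
sketch (not taken from the papers cited below). It assumes a plain RFC 1071
style ones'-complement sum as a stand-in for the TCP checksum, leaving out the
pseudo-header, and uses zlib's CRC-32 as a stand-in for both the link CRC and
an application-layer check. The payload bytes and the reordering "corruption"
are hypothetical.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement checksum over data (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = b"\x12\x34\x56\x78\x9a\xbc"
# Hypothetical corruption: the first two 16-bit words are swapped.
reordered = b"\x56\x78\x12\x34\x9a\xbc"

# Addition is commutative, so the 16-bit sum cannot see the reordering.
same_sum = internet_checksum(original) == internet_checksum(reordered)
# CRC-32 is position-sensitive, so the two payloads get different values.
same_crc = zlib.crc32(original) == zlib.crc32(reordered)

print(same_sum, same_crc)  # True False
```

A sender of a large transfer could compute such a CRC (or a stronger hash) over
the whole payload and have the receiver verify it, without any change to TCP.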

Joe

> Stone, J., Greenwald, M., Partridge, C., and J. Hughes,
> "Performance of checksums and CRCs over real data", IEEE/ACM
> Transactions on Networking vol. 6 issue 5, pp. 529-543,
> October 1998.
> 
> Stone, J. and C. Partridge, "When the CRC and TCP Checksum
> Disagree", Proceedings of ACM SIGCOMM, September 2000.
> 
> 
> On 25 Jun 2010, at 19:36, Joe Touch wrote:
>>> Our experience has shown that TCP often runs over links with much
>>> higher error rates than intended for the technology (i.e., a design goal of
>>> BER <= 1E-12 over Ethernet often has an actual BER >> 1E-12). If such
>>> error-prone links are running at a high speed and with large segment
>>> sizes (jumbo / giant frames), there exists the real problem that within
>>> a short timeframe (weeks to months), TCP will deliver corrupted data, as
>>> CRC-16 was not able to catch the error.
>>
>> Just to confirm, is this a measured problem or a mathematical one? I.e., have
>> you seen this actually occur (i.e., TCP with invalid CRC-16 but valid checksum)
>> in the wild, in a lab, or have you never seen it but consider it important anyway?
>