Re: [tcpm] draft-eggert-tcpm-historicize-00

Joe Touch <touch@isi.edu> Thu, 01 July 2010 17:22 UTC


Scheffenegger, Richard wrote:
...
>>> The ideal way to do this would be to have Ethernet frames include the
>>> CRC32C checksum instead of the CRC32 checksum. But changing the Ethernet
>>> standard across the board is impossible due to backward compatibility
>>> issues.
>>
>> That's no more true for Ethernet than for TCP.
>
> I beg to differ.
>  
> 802.3 Ethernet has no way of negotiating the type of CRC used - it's
> always been CRC32, and probably will always be CRC32 - for the simple
> fact that MTU1500 frames can potentially be bridged over legacy 10-Mbit
> half-duplex Hubs, even if they originated from state-of-the-art, 40G
> Ethernet Interfaces.

We're talking about jumbograms, which, according to other posts, are not yet
standardized. This presents a fine opportunity to standardize them with a new CRC.

That should not affect legacy MTU1500 frames.

...
> My understanding from the example Anumita mentioned (the one customer
> copying a few TB around, only to find a few bits flipped) is that an
> application level checksum (CRC32C for each 8k application data block)
> was used to find those problems. However, the damage was already done.

Where was the damage done?

>  
> So, to answer your question
>  
>> Just to confirm, is this a measured problem or a mathematical one? I.e., have
>> you seen this actually occur (i.e., TCP with invalid CRC-16 but valid checksum)
>> in the wild, in a lab, or have you never seen it but consider it important
>> anyway?
> 
> the answer would be in the wild. I agree with you that there are
> probably many other sources of corruption (a bit flip in the TCP receive /
> send buffer before the CRC is ever calculated; after all, ECC memory is
> also not perfect, and 3-bit errors can go unnoticed). However, raising
> the confidence levels of the TCP layer over known problematic links
> (BER >> 1E-12) would help...

It provably does not help unless the errors occur at the link layer. Seeing
faulty application transfers does not indicate that a link solution would help.
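
As background for the "valid checksum" case quoted above: the TCP checksum is
the 16-bit ones' complement sum described in RFC 1071, so certain corruptions
are invisible to it no matter where they occur. A minimal sketch follows; the
helper name and sample bytes are illustrative and not from this thread, and the
TCP pseudo-header is left out for brevity.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each big-endian 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


# Illustrative data: the first two 16-bit words are swapped in `swapped`.
original = b"\x12\x34\x56\x78\x9a\xbc"
swapped = b"\x56\x78\x12\x34\x9a\xbc"

# The payloads differ, yet the checksum cannot tell them apart,
# because addition does not depend on the order of the words.
assert original != swapped
assert internet_checksum(original) == internet_checksum(swapped)
```

This only shows that such errors can pass the checksum; it says nothing about
whether they arose on the link, in a buffer, or in memory.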

>> An interesting question is why you care that the payload doesn't get
>> accidentally altered, but you don't care if the port or address does (and you
>> don't notice), though.
> As mentioned, providing payload data integrity for the stream would be
> the goal. If a packet's header is corrupted (but goes undetected), that
> stray segment will cause no harm (or at least with many orders of
> magnitude lower probability); for the stream where that packet came from,
> it will just look like any other dropped frame;

Yes, but it could easily (and silently) corrupt the data of other streams. That
does NOT seem like a reasonable trade-off for a transport protocol protection
mechanism.

> however, if the integrity
> of the data (especially of jumbo (9k-16k) or giant (32k-64k) frames)
> cannot be guaranteed by L2 (Ethernet) checksums under adverse conditions,
> and TCP also does not provide a means for it, a major forklift upgrade
> would be required - commercial software doing L5+ checksums and
> retransmissions on top of TCP.

If you can't protect against faulty data going into other connections, you will
need such a forklift upgrade anyway.

...
> Also, from an architectural point of view, I can agree that changing the
> CRC32 of Ethernet to CRC32C or CRC64 might have been (!) the proper way
> in the time when 1 Gigabit was standardized (and with it, the support
> for Jumbo Frames). But we passed that milestone a decade ago, unfortunately.
>  
> Do we have a member here who is also active in 802.3ba (HSSG), and who
> could comment on the feasibility of changing the CRC in Ethernet?

AFAICT, if jumbograms aren't yet standardized, it's definitely feasible.

E.g., let CRC be something that depends on the payload length; if length >1500,
use a different CRC.
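
A rough sketch of that idea, under stated assumptions: the 1500-byte threshold,
the choice of CRC32C for the longer frames, and the function names are all
hypothetical, and the check is computed over the payload only, whereas a real
Ethernet FCS also covers the frame header.

```python
import zlib

CRC32C_POLY_REFLECTED = 0x82F63B78  # Castagnoli polynomial, bit-reflected form


def _make_crc32c_table():
    """Build the 256-entry lookup table for byte-at-a-time CRC32C."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ CRC32C_POLY_REFLECTED if crc & 1 else crc >> 1
        table.append(crc)
    return table


_CRC32C_TABLE = _make_crc32c_table()


def crc32c(data: bytes) -> int:
    """CRC32C (Castagnoli) with the usual initial value and final XOR."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = _CRC32C_TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


LEGACY_MAX_PAYLOAD = 1500  # assumption: the cut-over point suggested above


def frame_check_sequence(payload: bytes) -> int:
    """Pick the CRC by payload length (hypothetical rule, payload only)."""
    if len(payload) > LEGACY_MAX_PAYLOAD:
        return crc32c(payload)  # longer frames: the new CRC
    return zlib.crc32(payload) & 0xFFFFFFFF  # legacy frames: the existing CRC32


# Standard check values for the ASCII string "123456789".
assert zlib.crc32(b"123456789") == 0xCBF43926
assert crc32c(b"123456789") == 0xE3069283
```

Because the rule keys only on length, frames of 1500 bytes or less keep the
CRC32 that legacy equipment already expects.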

Joe