Re: [tcpm] draft-eggert-tcpm-historicize-00

"Scheffenegger, Richard" <rs@netapp.com> Thu, 01 July 2010 11:08 UTC

Date: Thu, 01 Jul 2010 12:07:24 +0100
Message-ID: <5FDC413D5FA246468C200652D63E627A0639E4E9@LDCMVEXC1-PRD.hq.netapp.com>
From: "Scheffenegger, Richard" <rs@netapp.com>
To: Joe Touch <touch@isi.edu>, "Biswas, Anumita" <Anumita.Biswas@netapp.com>
Cc: tcpm@ietf.org, ananth@cisco.com, L.Wood@surrey.ac.uk
Subject: Re: [tcpm] draft-eggert-tcpm-historicize-00

Hi Joe,
 
> Note that those vendors *do* deploy new capabilities - e.g., see the IETF TRILL WG.

IETF TRILL is not a good example, as that is the top side of L2 (the management protocol for the topology).
 
 
>>> that such data centers need different *ethernet* checksums,
>>> not different TCP ones. Then the data centers can leverage -
>>> and reuse - that where necessary.
>>
>> The ideal way to do this would be to have Ethernet frames include the
>> CRC32C checksum instead of the CRC32 checksum. But changing the Ethernet
>> standard across the board is impossible due to backward compatibility
>> issues.
>
> That's no more true for ethernet than for TCP.

I beg to differ.
 
802.3 Ethernet has no way of negotiating the type of CRC used. It has always been CRC32, and probably always will be, for the simple reason that MTU-1500 frames can potentially be bridged over legacy 10-Mbit half-duplex hubs, even if they originated from state-of-the-art 40G Ethernet interfaces.
 
In comparison, TCP *does have* (and has always had) the ability to negotiate new features, and thereby to maintain backwards compatibility.
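
To make the negotiation point concrete, here is a rough sketch in Python. The option name "ALT_CSUM_CRC32C" and both helper functions are made up for illustration; this is not a registered TCP option or any real stack's API. The only point is that a feature is used when both ends advertised it in the handshake, and a legacy peer simply falls back to the standard checksum.

```python
# Sketch only: "ALT_CSUM_CRC32C" is a hypothetical option name, not a registered TCP option.

def negotiate(syn_options, synack_options):
    """Return the optional features that both ends advertised in the handshake."""
    return set(syn_options) & set(synack_options)


def pick_checksum(agreed):
    """Use the stronger checksum only if both ends opted in, else the standard one."""
    return "crc32c" if "ALT_CSUM_CRC32C" in agreed else "internet16"


# A legacy peer ignores the unknown option, so the connection falls back.
legacy = pick_checksum(negotiate({"MSS", "ALT_CSUM_CRC32C"}, {"MSS"}))
# Two upgraded peers both offer it, so the stronger checksum is selected.
modern = pick_checksum(negotiate({"MSS", "ALT_CSUM_CRC32C"}, {"MSS", "ALT_CSUM_CRC32C"}))
print(legacy, modern)
```

Ethernet has no equivalent handshake at the frame level, which is the difference I am pointing at.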
 
My understanding of the example Anumita mentioned (the one customer copying a few TB around, only to find a few bits flipped) is that an application-level checksum (CRC32C over each 8k application data block) was used to find those problems. However, the damage was already done.
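
For illustration, a per-block check of that kind could look roughly like the Python sketch below. I do not know how the customer's application actually implemented it; the 8192-byte block size and the helper names block_checksums and damaged_blocks are my assumptions. CRC32C is computed here with a plain table-driven routine over the reflected Castagnoli polynomial.

```python
# Sketch only: block size and helper names are assumptions, not the customer's code.

def _make_table():
    poly = 0x82F63B78  # Castagnoli polynomial, bit-reflected form
    table = []
    for n in range(256):
        c = n
        for _ in range(8):
            c = (c >> 1) ^ poly if c & 1 else c >> 1
        table.append(c)
    return table


_TABLE = _make_table()


def crc32c(data: bytes) -> int:
    """Table-driven CRC32C with the usual initial value and final XOR."""
    crc = 0xFFFFFFFF
    for b in data:
        crc = _TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


BLOCK = 8192  # assumed 8k application block


def block_checksums(data: bytes, block: int = BLOCK):
    """One CRC32C per application data block."""
    return [crc32c(data[i:i + block]) for i in range(0, len(data), block)]


def damaged_blocks(sent_sums, received: bytes):
    """Indices of blocks whose CRC32C no longer matches what the sender computed."""
    got = block_checksums(received)
    return [i for i, (a, b) in enumerate(zip(sent_sums, got)) if a != b]


payload = bytes(range(256)) * 128          # 4 blocks of 8192 bytes
sums = block_checksums(payload)
corrupted = bytearray(payload)
corrupted[2 * BLOCK + 17] ^= 0x04          # flip a single bit inside block 2
print(damaged_blocks(sums, bytes(corrupted)))
```

Such a check tells you which block is bad, but only after the transfer, and the application then has to fetch that block again itself.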
 
So, to answer your question
 
> Just to confirm, is this a measured problem or a mathematical one? I.e., have
> you seen this actually occur (i.e., TCP with invalid CRC-16 but valid checksum)
> in the wild, in a lab, or have you never seen it but consider it important anyway?
 
the answer would be: in the wild. I agree with you that there are probably many other sources of corruption (a bit flip in the TCP receive or send buffer before the CRC is ever calculated; after all, ECC memory is not perfect either, and 3-bit errors can go unnoticed). However, raising the confidence level of the TCP layer over known problematic links (BER >> 1E-12) would help.
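
To show why the existing TCP checksum gives limited confidence, here is a small Python sketch with a constructed example (not the customer's data). It computes the 16-bit ones'-complement Internet checksum over a few bytes and then flips the same bit position in opposite directions in two different 16-bit words. The sum is unchanged, so the checksum still verifies, while a CRC notices. I use zlib.crc32 (the Ethernet CRC32 polynomial) from the standard library as a stand-in for a CRC; CRC32C would catch this two-bit error as well.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement checksum as used by TCP (pseudo-header omitted here)."""
    if len(data) % 2:
        data += b"\x00"                     # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                      # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


original = bytes.fromhex("0001f303f4f5f6f7")
# Bit 0x0100 flipped 0->1 in word 0 and 1->0 in word 1: the two errors cancel in the sum.
corrupted = bytes.fromhex("0101f203f4f5f6f7")

same_sum = internet_checksum(original) == internet_checksum(corrupted)
same_crc = zlib.crc32(original) == zlib.crc32(corrupted)
print(same_sum, same_crc)
```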
 
> An interesting question is why you care that the payload doesn't get
> accidentally altered, but you don't care if the port or address does (and you
> don't notice), though.

As mentioned, providing payload data integrity for the stream would be the goal. If a packet's header is corrupted (and that goes undetected), the stray segment will cause no harm (or at least will do so with a probability many orders of magnitude lower); for the stream that packet came from, it just looks like any other dropped frame. However, if the integrity of the data, especially in jumbo (9k-16k) or giant (32k-64k) frames, cannot be guaranteed by L2 (Ethernet) checksums under adverse conditions, and TCP does not provide a means for it either, a major forklift upgrade would be required: commercial software doing L5+ checksums and retransmissions on top of TCP.
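
As a back-of-the-envelope illustration of why frame size matters, the Python sketch below estimates raw bit errors on the wire, before any checksum has had a chance to catch them. It assumes independent bit errors at a fixed BER, which is a simplification; the 2 TB transfer size and the frame sizes are example numbers of mine, not measurements.

```python
import math


def expected_bit_errors(num_bytes: float, ber: float) -> float:
    """Expected number of raw bit errors when moving num_bytes at the given BER."""
    return num_bytes * 8 * ber


def p_frame_hit(frame_bytes: int, ber: float) -> float:
    """Probability that a frame contains at least one bit error: 1 - (1 - ber)^bits."""
    return -math.expm1(frame_bytes * 8 * math.log1p(-ber))


ber = 1e-12
print(expected_bit_errors(2e12, ber))        # raw bit errors expected over a 2 TB copy
for size in (1500, 9000, 64000):
    print(size, p_frame_hit(size, ber))      # per-frame hit probability grows with size
```

Each frame that is hit then relies on the L2 CRC and the TCP checksum to be caught, and larger frames put more bits behind the same 32-bit CRC.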
 
As Anumita pointed out, TCP segment checksums can be offloaded to hardware relatively easily (provided there is a standard for it); it might not be the perfect solution, but it is the most effective one to deploy incrementally.
 
Also, from an architectural point of view, I can agree that changing the CRC32 of Ethernet to CRC32C or CRC64 might have been (!) the proper way at the time when 1 Gigabit was standardized (and with it, the support for jumbo frames). But we passed that milestone a decade ago, unfortunately.
 
Do we have a member here who is also active in 802.3ba (HSSG), and who could comment on the feasibility of changing the CRC in Ethernet?
 
Regards,
  Richard
________________________________

From: Joe Touch [mailto:touch@isi.edu]
Sent: Sun 27.06.2010 23:00
To: Biswas, Anumita
Cc: tcpm@ietf.org; ananth@cisco.com; L.Wood@surrey.ac.uk
Subject: Re: [tcpm] draft-eggert-tcpm-historicize-00



Hi, Anumita,

Biswas, Anumita wrote:
>> Everything so far points to the need for an application
>> protocol check, i.e., a final, total transfer checksum on
>> each large transaction (i.e., the whole file). That was one
>> conclusion of the 2000 paper, FWIW.
>>
> I mentioned the advantage of being able to offload a TCP based checksum
> to the NIC. If I were to extend your argument that end system software
> is more likely the problem - a checksum over the whole file could also
> be prone to error.

That's true too, but it would catch errors at all other places in the stack,
e.g., even errors that would not be caught by errors actually seen (e.g., in the
2000 paper).

> I think we are only trying to deploy a "stronger" checksum - we are not
> making claims that it will find all problems in end to end software -
> and as such is not the strongest checksum.

My point above is that 'stronger' won't fix the flaws actually seen thus far.

...
> Please read why the Castagnoli polynomial based CRC checksum is stronger
> than the Ethernet CRC32 checksum carried by the Ethernet frame.

I have no question about that. This is a great argument to raise to the Ethernet
community.

> Your argument suggests
>> that such data centers need different *ethernet* checksums,
>> not different TCP ones. Then the data centers can leverage -
>> and reuse - that where necessary.
>
> The ideal way to do this would be to have Ethernet frames include the
> CRC32C checksum instead of the CRC32 checksum. But changing the Ethernet
> standard across the board is impossible due to backward compatibility
> issues.

That's no more true for ethernet than for TCP.

> But TCP options provides a much easier path to introducing a
> stronger checksum as it does not break backward compatibility and can be
> offloaded to the NIC as well.

That may all be true, but as I've noted repeatedly, the errors *actually seen*
thus far would not be solved by a new, stronger TCP sum. Only an application sum
would have caught those errors.

> Data Centers are one place where Jumbo frames are deployed - but there
> could be others. So I don't understand what you mean by "data centers
> can leverage and reuse that where necessary".

I'm suggesting that this is a jumboframe problem. Fix it in the implementation
of jumboframes, and all users of jumboframes will eventually benefit - i.e., fix
this at the Ethernet vendor level, and it can be used in all data centers that
use jumboframes.

Note that those vendors *do* deploy new capabilities - e.g., see the IETF TRILL WG.

Joe