dataplane encapsulation considerations - checksums

Stewart Bryant <stbryant@cisco.com> Wed, 10 December 2014 16:19 UTC

Changing the thread title to match the subject

Lloyd

Those numbers are orders of magnitude larger than anything I have
seen elsewhere.

If we look at them in detail they are interesting: if the no-port
packets are c/s errors, as you hypothesise, then only about 1/3 of the
errors are seemingly caught by the c/s. I would thus expect you to be
arguing strongly that we need to deprecate the existing c/s in favour
of something much stronger such as Fletcher, or ideally a crypto checksum.
In the case of the link state IGPs we have operators that turn on
link state security not for security but for the enhanced checksum,
and that seems to be what you need here in the host stacks.
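
To make the arithmetic behind those figures explicit, here is a small
sketch in Python using the counters from the quoted "sh tcp stat" output
below. The "caught fraction" rests on the hypothesis under discussion,
namely that every "no port" packet is a corruption that slipped past the
checksum; that is an assumption, not a measured fact, and the function
names are mine.

```python
# Counters copied from the quoted "sh tcp stat" output in this thread.
switches = {
    "SWT1": {"total": 4332545, "no_port": 19458, "checksum_error": 10287},
    "SWT2": {"total": 6370948, "no_port": 37608, "checksum_error": 28419},
}


def checksum_error_rate(counters):
    """Fraction of received TCP packets rejected by the checksum."""
    return counters["checksum_error"] / counters["total"]


def fraction_caught(counters):
    """Share of suspected corruptions that the checksum caught.

    ASSUMPTION: every 'no port' packet is an undetected corruption.
    Some of them may simply be probes of a port that is not open.
    """
    suspected = counters["checksum_error"] + counters["no_port"]
    return counters["checksum_error"] / suspected


for name, counters in switches.items():
    print(
        f"{name}: checksum error rate {checksum_error_rate(counters):.2%}, "
        f"caught fraction {fraction_caught(counters):.0%}"
    )
```

Under that assumption SWT1 comes out at roughly a third caught and SWT2
somewhat higher, so the "1/3" above is the pessimistic end of the two
samples.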

However we have another study that is worth looking at:

https://www.verisigninc.com/assets/VRSN_Bitsquatting_TR_20120320.pdf

This looks at the UDP c/s errors received by DNS servers, and they
saw an error rate of 1 in 10^5. However if you read the detail there are
some systematic effects going on, with a significant fraction seemingly due
to transmit host stack problems which may be what you are seeing.

Now I think all this points to two things. Firstly, from the perspective
of a host the c/s looks inadequate and there may need to be a
revision of the host stack.
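
As an illustration of one known weakness, the sketch below (Python)
compares a plain 16-bit ones' complement sum, in the style of RFC 1071,
with Fletcher-16 on a buffer whose two 16-bit words have been exchanged.
It sums only the bytes it is given; the pseudo-header that TCP and UDP
also cover is left out, and the sample bytes are arbitrary. It shows a
single class of undetected error, not a measurement of how often such
errors occur in practice.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum over data (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)  # fold the carry back in
    return ~total & 0xFFFF


def fletcher16(data: bytes) -> int:
    """Fletcher-16: the second running sum makes the result order dependent."""
    s1 = s2 = 0
    for byte in data:
        s1 = (s1 + byte) % 255
        s2 = (s2 + s1) % 255
    return (s2 << 8) | s1


original = b"\x12\x34\xab\xcd"
swapped = b"\xab\xcd\x12\x34"  # same 16-bit words, order exchanged

# Addition is commutative, so the ones' complement sum cannot see the swap.
print(internet_checksum(original) == internet_checksum(swapped))  # True
# Fletcher-16 weights each byte by its position, so the swap is detected.
print(fletcher16(original) == fletcher16(swapped))  # False
```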

On the other hand, the error rate for UDP that the paper
reports is an overestimate of the error rate in the tunnel case, since
only an error in the IP and UDP header can cause misdelivery,
with all other errors reflecting themselves as payload errors, which
the payload error protection should catch. Although I do note that they
report some systematic effect in terms of bit position, which may point
either way.
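
A rough way to size that overestimate, sketched in Python: if bit errors
were spread uniformly over the packet (an assumption, and one that the
systematic bit-position effect just mentioned may well break), the share
of errors that could cause misdelivery is simply the share of the packet
taken up by the outer headers. The sketch assumes an IPv4 header without
options (20 bytes) plus the 8-byte UDP header; the packet sizes are
illustrative only.

```python
IPV4_HEADER_BYTES = 20  # assumed: IPv4 header without options
UDP_HEADER_BYTES = 8


def header_fraction(packet_bytes: int) -> float:
    """Share of a packet's bits that sit in the outer IPv4 + UDP headers.

    ASSUMPTION: errors are uniformly distributed over the packet, so this
    is also the share of single-bit errors that land in the headers.
    """
    header = IPV4_HEADER_BYTES + UDP_HEADER_BYTES
    if packet_bytes < header:
        raise ValueError("packet is smaller than the IPv4 + UDP headers")
    return header / packet_bytes


for size in (64, 576, 1500):  # illustrative total packet sizes in bytes
    print(f"{size:5d} bytes: {header_fraction(size):.1%} of bits in headers")
```

For full-size packets only a small percentage of the bits are in the
headers, which is why the reported UDP figure should be scaled down
before applying it to misdelivery in the tunnel case.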

One other aspect of this, which is important in Routing: if errors at
the rates you imply are getting past the TCP c/s, surely the routing
protocols will be importing long term errors into the routing subsystems
(specifically BGP and LDP). Are people observing that, and if so don't we
need to fix it?

- Stewart



On 10/12/2014 13:24, l.wood@surrey.ac.uk wrote:
> Stewart,
>
> Who needs an NMS? Let's go old school.
> TCP and UDP have equivalent pseudo-header+payload ones' complement checksums. So data from TCP can be considered somewhat equivalent, as the technology is the same, and TCP is somewhat better instrumented and monitored than UDP is.
>
> The sample below, from two core switches, on TCP connections to them (not, alas, through them), shows e.g. 10287 TCP checksum errors for 4.3 million TCP/IP packets received. That would be an error rate of 0.24%; the other switch is at 0.45%. Note the figure of e.g. 19458 no port - entirely possible something inside the core network is trying a port that isn't there, but also possibly corruption of port nos. These are firewalled core devices with limited traffic to them, so I would consider this a best case, without throwing edge device data into the mix.
>
> Extrapolating from TCP checksum rates to UDP checksum rates, from traffic to the device to traffic through it, and from v4 to v6 is left as an exercise for the reader.
>
> Incidentally, these rates pretty much match the 1 in 400 observation made early in Stone's SIGCOMM 2000 paper. 1:400 is 10,000:4,000,000
> http://conferences.sigcomm.org/sigcomm/2000/conf/paper/sigcomm2000-9-1.pdf
> Zut alors, quelle surprise, physics is still physics, this particular SIGCOMM paper is still valid.
> Checksums fail because of corruption. But at least there's a checksum to catch the corruption. Sans checksums, the corruption is unnoticed.
>
> The main reason no-one has seen misbehaviour with MPLS is because no-one is looking for or instrumenting for it. (Just like MPLS doesn't need TTL, because no-one ever sees MPLS routing loops, right?) Please go measure MPLS and report back.
>
> Traversing over UDP, as suggested in draft-ietf-mpls-in-udp and draft-ietf-tsvwg-gre-in-udp-encap, increases the scope for corruption across a longer path, rather than individual links - and the UDP checksum is an important last check.
>
> SWT1#sh tcp stat
> Rcvd: 4332545 Total, 19458 no port
>        10287 checksum error, 4959 bad offset, 0 too short
>        62946 packets (2289083 bytes) in sequence
>        93 dup packets (19167 bytes)
>        34 partially dup packets (1878 bytes)
>        107 out-of-order packets (17980 bytes)
>        8 packets (8 bytes) with data after window
>        0 packets after close
>        0 window probe packets, 614 window update packets
>        682 dup ack packets, 0 ack packets with unsend data
>        179823 ack packets (37459618 bytes)
> Sent: 2049509 Total, 0 urgent packets
>        1774234 control packets (including 345 retransmitted)
>        249103 data packets (37439059 bytes)
>        308 data packets (41767 bytes) retransmitted
>        46 data packets (4876 bytes) fastretransmitted
>        25718 ack only packets (3764 delayed)
>        0 window probe packets, 104 window update packets
> 9649 Connections initiated, 1230 connections accepted, 10857 connections established
> 1760675 Connections closed (including 30 dropped, 1749812 embryonic dropped)
> 653 Total rxmt timeout, 9 connections dropped in rxmt timeout
> 0 Keepalive timeout, 6 keepalive probe, 0 Connections dropped in keepalive
>
>
> SWT2#sh tcp stat
> Rcvd: 6370948 Total, 37608 no port
>        28419 checksum error, 1749 bad offset, 0 too short
>        181714 packets (3507560 bytes) in sequence
>        130 dup packets (4925 bytes)
>        164 partially dup packets (302 bytes)
>        35 out-of-order packets (278 bytes)
>        64 packets (4549 bytes) with data after window
>        0 packets after close
>        0 window probe packets, 2006 window update packets
>        3865 dup ack packets, 0 ack packets with unsend data
>        857922 ack packets (92503306 bytes)
> Sent: 3793587 Total, 0 urgent packets
>        2667720 control packets (including 215 retransmitted)
>        1042867 data packets (92804839 bytes)
>        1772 data packets (74842 bytes) retransmitted
>        192 data packets (7176 bytes) fastretransmitted
>        81025 ack only packets (15840 delayed)
>        0 window probe packets, 19 window update packets
> 28735 Connections initiated, 5928 connections accepted, 34638 connections established
> 2626485 Connections closed (including 1349 dropped, 2591827 embryonic dropped)
> 1987 Total rxmt timeout, 4 connections dropped in rxmt timeout
> 0 Keepalive timeout, 0 keepalive probe, 0 Connections dropped in keepalive
>
> Lloyd Wood
> http://about.me/lloydwood
>
> error-free modern networking technology? ha.