RE: [Tsvwg] SCTP and Checksums

"Douglas Otis" <dotis@sanlight.net> Wed, 16 May 2001 15:22 UTC

Received: from optimus.ietf.org (ietf.org [132.151.1.19] (may be forged)) by ietf.org (8.9.1a/8.9.1a) with SMTP id LAA11256 for <tsvwg-archive@odin.ietf.org>; Wed, 16 May 2001 11:22:23 -0400 (EDT)
Received: from optimus.ietf.org (localhost [127.0.0.1]) by optimus.ietf.org (8.9.1a/8.9.1) with ESMTP id LAA06311; Wed, 16 May 2001 11:02:11 -0400 (EDT)
Received: from ietf.org (odin [132.151.1.176]) by optimus.ietf.org (8.9.1a/8.9.1) with ESMTP id LAA06278 for <tsvwg@ns.ietf.org>; Wed, 16 May 2001 11:02:03 -0400 (EDT)
Received: from c007.snv.cp.net ([209.228.33.214]) by ietf.org (8.9.1a/8.9.1a) with SMTP id LAA10924 for <tsvwg@ietf.org>; Wed, 16 May 2001 11:01:54 -0400 (EDT)
Received: (cpmta 15975 invoked from network); 16 May 2001 07:37:38 -0700
Received: from unknown (HELO ljoy) (64.130.130.105) by smtp.telocity.com (209.228.33.214) with SMTP; 16 May 2001 07:37:38 -0700
X-Sent: 16 May 2001 14:37:38 GMT
From: Douglas Otis <dotis@sanlight.net>
To: "Randall R. Stewart" <randall@stewart.chicago.il.us>, "WENDT,JIM (HP-Roseville,ex1)" <jim_wendt@hp.com>
Cc: Black_David@emc.com, tsvwg@ietf.org, Craig Partridge <craig@aland.bbn.com>, Jonathan Stone <jonathan@dsg.stanford.edu>
Subject: RE: [Tsvwg] SCTP and Checksums
Date: Wed, 16 May 2001 07:35:18 -0700
Message-ID: <NEBBJGDMMLHHCIKHGBEJAEJECHAA.dotis@sanlight.net>
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3 (Normal)
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook IMO, Build 9.0.2416 (9.0.2911.0)
Importance: Normal
In-Reply-To: <3B01CFB9.4038F3F@stewart.chicago.il.us>
X-MimeOLE: Produced By Microsoft MimeOLE V5.50.4522.1200
Sender: tsvwg-admin@ietf.org
Errors-To: tsvwg-admin@ietf.org
X-Mailman-Version: 1.0
Precedence: bulk
List-Id: Transport Area Working Group <tsvwg.ietf.org>
X-BeenThere: tsvwg@ietf.org

All,

I have completed the half-packet tests, which now include a modified Adler
with a 16-bit add and a modified Fletcher that uses a modulo of 65535.

See errchk_half.c and errchk_half.exe at:
ftp://ftp.sanlight.net/pub/

The Modified Adler, as recommended by Jonathan, has the following blind codes:

s1+testbuf[i] = 65535 or 14
s1+testbuf[i] = 65534 or 13
s1+testbuf[i] = 65533 or 12
s1+testbuf[i] = 65532 or 11
s1+testbuf[i] = 65531 or 10
s1+testbuf[i] = 65530 or 9
s1+testbuf[i] = 65529 or 8
s1+testbuf[i] = 65528 or 7
s1+testbuf[i] = 65527 or 6
s1+testbuf[i] = 65526 or 5
s1+testbuf[i] = 65525 or 4
s1+testbuf[i] = 65524 or 3
s1+testbuf[i] = 65523 or 2
s1+testbuf[i] = 65522 or 1
s1+testbuf[i] = 65521 or 0

There are also some additional blind codes due to the partial pipeline of
the Adler modulo.

for (i = 0; i < TBUFSIZ; i++)
  {
  s1 += testbuf[i];
  if (s1 >= ADLER_MODULO)
    {
    s1 += 15;        /* 65536 - 65521 = 15: add 15 ...            */
    s1 &= 0xffff;    /* ... and drop the carry to subtract 65521  */
    }
  s2 += s1;
  if (s2 >= ADLER_MODULO)
    {
    s2 += 15;
    s2 &= 0xffff;
    }
  }
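The pairs listed above can be reproduced from the fold step alone.  What
follows is a minimal sketch, assuming s1 lives in a 16-bit register so that
the add wraps before the comparison (one reading of "16 bit add").  The
names fold16 and preimages are hypothetical and are not taken from
errchk_half.c.

```c
#include <stdint.h>

#define ADLER_MODULO 65521u

/* Fold a 16-bit register value that already holds (s1 + byte), wrapped. */
static uint16_t fold16(uint16_t reg)
{
    if (reg >= ADLER_MODULO)
        reg = (uint16_t)(reg + 15u);  /* +15 with wrap == subtract 65521 */
    return reg;
}

/* Count how many 16-bit register values fold to `target`. */
static unsigned preimages(uint16_t target)
{
    unsigned count = 0;
    for (uint32_t v = 0; v <= 0xffffu; v++)
        if (fold16((uint16_t)v) == target)
            count++;
    return count;
}
```

Under that assumption, preimages(t) is 2 for every t from 0 through 14 (for
example, both 65535 and 14 fold to 14) and 1 for t from 15 through 65520,
which matches the fifteen pairs in the list.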

The modified Fletcher, as recommended by Jonathan but not as described in
RFC 1146, has the following blind code:

s1+testbuf[i] = 65535 or 0

for (i = 0; i < TBUFSIZ; i++)
  {
  s1 += testbuf[i];
  if (s1 >= 0xffff)
    {
    s1++;
    s1 &= 0xffff;
    }
  s2 += s1;
  }
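The single blind code can be checked by isolating the s1 update.  This is
a small sketch only: fletcher_step is a hypothetical name, and s1 is
assumed to be wider than 16 bits so that the comparison sees the full sum.

```c
#include <stdint.h>

/* Add one byte to s1 using the compare, increment, and mask fold. */
static uint32_t fletcher_step(uint32_t s1, uint8_t byte)
{
    s1 += byte;
    if (s1 >= 0xffffu) {
        s1++;            /* end-around carry: 65536 is congruent to 1 */
        s1 &= 0xffffu;
    }
    return s1;
}
```

For any s1 below 65535 the result equals (s1 + byte) % 65535, so a raw sum
of 65535 and a raw sum of 0 both leave 0 in s1.  For instance,
fletcher_step(65534, 1) and fletcher_step(0, 0) both return 0.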

Perhaps as expected, the Fletcher code had a 1.3% failure rate instead of
2%.  The Adler-32, Modified Adler, and Modified Fletcher had zero undetected
errors in the same number of passes as before, around 28 million.  To
get relative error rate information, defects will need to be introduced
more aggressively or shorter packets will need to be investigated.  Note
that none of these algorithms, with their blind codes or byte adds, are
robust with respect to burst errors.  Moving off of the binary modulo does,
however, offer a significant advantage with respect to stuck bit errors.  I
also found an error in the prior Adler-32 routine: I used short rather than
int, which reduced the application of the modulo.
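
One way to see why a non-binary modulo can help with stuck bits is a toy
byte-wise Fletcher with a selectable modulus.  This is an illustration
under stated assumptions, not the simulation in errchk_half.c: the names
fletcher8 and stick_bit7, the 8-bit sums, and the four-byte buffer are all
hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Byte-wise Fletcher sums with a selectable modulus (at most 256);
   returns (s2 << 8) | s1. */
static uint16_t fletcher8(const uint8_t *buf, size_t len, unsigned modulus)
{
    unsigned s1 = 0, s2 = 0;
    for (size_t i = 0; i < len; i++) {
        s1 = (s1 + buf[i]) % modulus;
        s2 = (s2 + s1) % modulus;
    }
    return (uint16_t)((s2 << 8) | s1);
}

/* Force bit 7 high in every byte, as a stuck data line would. */
static void stick_bit7(uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++)
        buf[i] |= 0x80u;
}
```

With the buffer {1, 2, 3, 4}, sticking bit 7 adds 128 to each byte.  That
shifts s1 by 512 and s2 by 1280, both multiples of 256, so the modulus-256
sums are unchanged and the error goes undetected.  With modulus 255 the
same corruption changes both sums and is caught.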

Doug


> Jim:
>
> I think this is an excellent point... From the papers I gather
> that packets in the "Wild" tend to exhibit this type of huge
> numbers of bits 1000-2000 bits at a time... to the end of the
> packet per chance... :)
>
> Doug, can you re-run your tests with this type of results i.e.
> the packet gets corrupted from the middle all the way to the
> end with a complete copy error... say the last part of the
> packet is mutilated...
>
> R
>
> "WENDT,JIM (HP-Roseville,ex1)" wrote:
> >
> > Doug,
> >
> > > Burst errors should already be guarded by media CRC.
> >
> > What do you mean by "media CRC"?  Is this the link level CRC on
> > Ethernet (or other) frames?
> > My impression from Jonathan and Craig's paper is that a variety of
> > software and hardware errors can (and do) occur within routers and
> > end nodes that appear as large burst errors. What is the performance
> > of the various checksum and CRC algorithms when errors with large
> > hamming distances (1000 or 2000 bits) are involved?
> >
> > Jim
> >
> > -----
> > Doug Otis writes:
> > > > > All,
> > > > >
> > > > > > I made a quick simulation of a stuck bit error and the ability
> > > > > > of either Fletcher or Adler-Fletcher to detect this problem.
> > > > > > Think of this as a memory cell within a router with a defective
> > > > > > driver.  The results were surprising so perhaps I made a mistake.
> > > > > >
> > > > > > See: Errchk.c and Errchk.exe  (Windows compatible)
> > > > > > ftp://ftp.sanlight.net/pub/
> > > > > >
> > > > > > The results indicated this significant error could not be
> > > > > > detected 2% of the time using Fletcher and yet the Adler-Fletcher
> > > > > > algorithm failed detection .1% of the time for a 20 times
> > > > > > improvement.  The simulation code plays a few tricks.  The SCTP
> > > > > > header ensures any stuck bit induces an error.
> > > > > >
> > > > > > The method of ensuring the Adler modulo used an integer with the
> > > > > > C code but this could be done using a 16 bit register if written
> > > > > > in assembly to take advantage of the carry flag.  This would
> > > > > > avoid the modulo comparison in most cases and would actually be
> > > > > > faster.  The modulo technique should not be viewed as more
> > > > > > instructions to that of the traditional technique.  A 16 bit add
> > > > > > can exceed the modulo; the addition technique with a 16 bit
> > > > > > register ensures the upper sum remains within an Adler modulo.
> > > > > >
> > > > > > I'll play around with other error techniques, but I thought this
> > > > > > was significant to provide this information sooner than later.
> > > > >
> > > > > Doug
> > > > >
> >
> > _______________________________________________
> > tsvwg mailing list
> > tsvwg@ietf.org
> > http://www1.ietf.org/mailman/listinfo/tsvwg
>
> --
> Randall R. Stewart
> randall@stewart.chicago.il.us or rrs@cisco.com
> 815-342-5222 (cell) 815-477-2127 (work)
>

