Re: [mpls] Last Call: <draft-ietf-mpls-in-udp-04.txt> (Encapsulating MPLS in UDP) to Proposed Standard

Curtis Villamizar <curtis@ipv6.occnc.com> Tue, 28 January 2014 01:00 UTC

To: Joe Touch <touch@isi.edu>
From: Curtis Villamizar <curtis@ipv6.occnc.com>
In-reply-to: Your message of "Mon, 27 Jan 2014 10:53:09 -0800." <52E6AB15.2080907@isi.edu>
Date: Mon, 27 Jan 2014 20:00:45 -0500
Cc: joel jaeggli <joelja@bogus.com>, "mpls@ietf.org" <mpls@ietf.org>, IETF discussion list <ietf@ietf.org>
Subject: Re: [mpls] Last Call: <draft-ietf-mpls-in-udp-04.txt> (Encapsulating MPLS in UDP) to Proposed Standard

In message <52E6AB15.2080907@isi.edu>
Joe Touch writes:
 
> On 1/27/2014 10:48 AM, joel jaeggli wrote:
> > On 1/27/14, 8:48 AM, Joe Touch wrote:
> >> Those same mechanisms have provided hardware checksum support for a very long time.
> >
> > The new header and the payload are actually in different parts of the
> > forwarding complex until they hit the output queue, you can't checksum
> > data you don't have.
>  
> You can (and some do) checksum the component parts when things go into
> memory; the partial sums can be added as the parts are combined in the
> output queue.
>  
> I appreciate that we're all talking about what might be done, but the
> reality is that there are many 'transparent TCP proxies' that have to
> do this, so there's clearly a solution, and it clearly runs fast
> enough.
>  
> Joe


Joe,

Chips that did 4 x 10 Gb/s are old stuff now, but in their day they
pushed silicon limits.  Chips now do N x 100 Gb/s.  Stewart is
describing how these chips (both the prior generation and the current
one) are architected, and you keep going back to the idea that there
is a common memory where all of this resides and it's just a matter of
looking at it.  This is not software on a general-purpose processor.

If a queue forms, the headers are in on-chip SRAM and the body of the
packet is off chip in DRAM.  There isn't enough memory bandwidth to
pull the packet back until it's time to send it out.  At that point
the header and body are joined, but the header processing was
completed long before.
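
To make the split concrete, here is a minimal sketch of the kind of
per-packet descriptor such a chip might keep; the names and sizes are
hypothetical, not any vendor's actual layout:

    /* Hypothetical split-buffer descriptor: the parsed and edited headers
     * stay in on-chip SRAM, while the body was written once to buffer DRAM
     * on arrival and is read back only when the packet reaches the head of
     * its output queue. */
    #include <stdint.h>

    struct pkt_descriptor {
        uint8_t  hdr[128];        /* edited headers, held on chip (SRAM)    */
        uint16_t hdr_len;         /* valid bytes in hdr[]                   */
        uint64_t body_dram_addr;  /* where the untouched body sits off chip */
        uint32_t body_len;        /* bytes to read back at transmit time    */
        uint32_t queue_id;        /* output queue this descriptor waits on  */
    };

A checksum over the whole datagram would mean touching the body behind
body_dram_addr again while the headers are still being finished, and
that extra read is exactly the memory bandwidth that isn't there.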

If there is not much of a queue, the packet may land in the SRAM cache
that fronts the buffer DRAM and go right back out of that SRAM.

If there is no queue at all, cut-through can happen.  The header gets
transmitted before the entire packet has arrived at the input of the
chip.  (And if the FCS check fails, a runt with a bad FCS goes out.)

At the very least this can't work if cut-through is used to reduce
latency.  When the two are joined, about the only processing left is
in the MAC; it's mostly loading a shift register and serializing, but
it also computes the FCS and sticks that *at the end*.  The MAC is
quite inflexible: it is mostly silicon gates designed to do some
minimal processing for a layer 2 such as Ethernet or GFP/OTN.
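
For reference, the Ethernet FCS is a CRC-32 accumulated over every
byte of the frame, which is why it can only be appended after the last
byte.  A minimal bitwise sketch (a real MAC does the same math in
parallel gates):

    #include <stdint.h>
    #include <stddef.h>

    /* Bitwise CRC-32 as used for the Ethernet FCS (reflected polynomial
     * 0xEDB88320).  The CRC depends on every byte of the frame, so the
     * 4-byte FCS can only be emitted once the last byte has streamed past. */
    uint32_t fcs_crc32(const uint8_t *frame, size_t len)
    {
        uint32_t crc = 0xFFFFFFFFu;
        for (size_t i = 0; i < len; i++) {
            crc ^= frame[i];
            for (int b = 0; b < 8; b++)
                crc = (crc & 1) ? (crc >> 1) ^ 0xEDB88320u : (crc >> 1);
        }
        return ~crc;   /* appended to the frame as the FCS */
    }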

Think for a moment what N x 100 Gb/s means.  For small packets that is
about 150 Mpps per interface once Ethernet overhead (which is large
for minimum-size frames) is counted.  That is 20 clock cycles per
packet at a 3 GHz clock rate.  Divide that by N.  This has to be done
in specialized hardware.  If you also think about the memory bandwidth
needed, you need very wide and very fast DRAM, you get one write when
the packet arrives and one read before it leaves, and even then you
have to play tricks in the SRAM cache such as concatenating short
packets going out the same queue.
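
The arithmetic behind those numbers, assuming minimum-size 64-byte
frames plus the usual 20 bytes of preamble and inter-frame gap:

    #include <stdio.h>

    int main(void)
    {
        /* Minimum Ethernet frame (64 B) plus preamble/SFD (8 B) and
         * inter-frame gap (12 B) = 84 B = 672 bits on the wire. */
        double wire_bits = (64 + 8 + 12) * 8.0;
        double line_rate = 100e9;     /* one 100 Gb/s interface        */
        double clock_hz  = 3e9;       /* 3 GHz packet-processing clock */

        double pps    = line_rate / wire_bits;   /* ~148.8 Mpps           */
        double cycles = clock_hz / pps;          /* ~20 cycles per packet */

        printf("%.1f Mpps, %.1f cycles per packet per 100G port\n",
               pps / 1e6, cycles);
        return 0;
    }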

BTW - "transparent TCP proxies" don't need to look at the payload for
the purpose of updating a checksum.  Whenever a header modification is
made you only need to know the old checksum, the old information and
the new information replacing it.  That is why IP checksum
modification can be done quickly.  Typically only TTL changes.  These
are also running at least a decimal order of magnitude or two slower.
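
That incremental update is the HC' = ~(~HC + ~m + m') method from
RFC 1624.  A minimal sketch (the helper names are mine, not from any
particular proxy), applied to the usual TTL decrement:

    #include <stdint.h>

    /* Incremental Internet checksum update per RFC 1624:
     *   HC' = ~(~HC + ~m + m')
     * where m is the old 16-bit header word and m' the new one.
     * No payload bytes are touched. */
    static uint16_t cksum_update16(uint16_t old_cksum, uint16_t old_word,
                                   uint16_t new_word)
    {
        uint32_t sum = (uint16_t)~old_cksum;
        sum += (uint16_t)~old_word;
        sum += new_word;
        sum  = (sum & 0xFFFF) + (sum >> 16);   /* fold carries */
        sum  = (sum & 0xFFFF) + (sum >> 16);
        return (uint16_t)~sum;
    }

    /* Example: decrementing the IPv4 TTL.  The TTL shares a 16-bit word
     * with the protocol field, so the whole word is passed in. */
    static uint16_t decrement_ttl_cksum(uint16_t iph_cksum,
                                        uint8_t ttl, uint8_t proto)
    {
        uint16_t old_word = ((uint16_t)ttl << 8) | proto;
        uint16_t new_word = ((uint16_t)(ttl - 1) << 8) | proto;
        return cksum_update16(iph_cksum, old_word, new_word);
    }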

Curtis