Re: [mpls] Last Call: <draft-ietf-mpls-in-udp-04.txt> (Encapsulating MPLS in UDP) to Proposed Standard

Joe Touch <touch@isi.edu> Tue, 28 January 2014 16:57 UTC

To: curtis@ipv6.occnc.com
Cc: joel jaeggli <joelja@bogus.com>, "mpls@ietf.org" <mpls@ietf.org>, IETF discussion list <ietf@ietf.org>

On Jan 27, 2014, at 5:29 PM, Curtis Villamizar <curtis@ipv6.occnc.com> wrote:

> 
> In message <52E6B272.4030703@isi.edu>
> Joe Touch writes:
> 
>> On 1/27/2014 11:19 AM, Joel M. Halpern wrote:
>>> Yes Joe, routers could have been built to do those calculations at that
>>> performance scale.
>>> There are however two major problems:
>>> 
>>> 1) That is not how routers are built.
>>> 2) The target performance scale is rather higher.
>>> 
>>> So could someone build an ASIC to do what you want?
>> 
>> Someone has. It's part of nearly every DMA ASIC in a network interface
>> already.
> 
> There is no DMA ASIC in these designs.  There is no shared memory to
> DMA from.  A router is not a specialized PC.

Agreed; the point was that it isn't an ASIC, but a very small function that can be included in any data transfer mechanism.
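
As a rough illustration of how small that function is, here is a sketch in Python of a copy loop that accumulates the 16-bit one's-complement sum while the data moves. The name `copy_with_checksum` is made up for this example, and real hardware would do this with adders in the data path rather than in software.

```python
def copy_with_checksum(src: bytes) -> tuple[bytes, int]:
    """Copy src two bytes at a time, accumulating the 16-bit
    one's-complement sum of the words as they pass through."""
    dst = bytearray()
    total = 0
    for i in range(0, len(src), 2):
        word = src[i:i + 2]
        dst += word                      # the "data transfer" part
        if len(word) == 1:               # odd trailing byte: pad with zero
            word += b"\x00"
        total += int.from_bytes(word, "big")
        total = (total & 0xFFFF) + (total >> 16)   # end-around carry
    return bytes(dst), total


if __name__ == "__main__":
    data = bytes.fromhex("0001f203f4f5f6f7")
    copied, running_sum = copy_with_checksum(data)
    checksum = ~running_sum & 0xFFFF     # Internet checksum is the complement
    print(copied == data, hex(running_sum), hex(checksum))
```

The sum costs one add and one carry fold per word moved, so it can ride along with any transfer that already touches every word.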

> BTW - the PCIe FCS is at the end of the transmission not the front so
> the DMA ASIC doesn't have to read memory twice, the first time to
> compute a checksum to put at the front.

Right, and until you modify UDP to have a trailer checksum, or use a different encapsulation that has one (e.g., via an IPv6 option), you need to deal with the fact that the UDP checksum requires a store-and-forward delay.

IMO, if you don't like that, then use another encapsulation.
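
To make the store-and-forward point concrete, here is a minimal Python sketch of building a UDP datagram over IPv4. The function names, addresses, and port numbers are placeholders chosen for the example, not values taken from the draft. The checksum field sits in the 8-byte header ahead of the payload, yet its value depends on every payload byte, so the header cannot be finalized until the whole payload has been seen.

```python
import socket
import struct


def ones_complement_sum(data: bytes) -> int:
    """16-bit one's-complement sum over big-endian 16-bit words."""
    if len(data) % 2:
        data += b"\x00"
    total = sum((data[i] << 8) | data[i + 1] for i in range(0, len(data), 2))
    while total >> 16:                       # fold carries back in
        total = (total & 0xFFFF) + (total >> 16)
    return total


def build_udp_datagram(src_ip: str, dst_ip: str,
                       src_port: int, dst_port: int,
                       payload: bytes) -> bytes:
    length = 8 + len(payload)                # UDP header is 8 bytes
    # IPv4 pseudo-header: src addr, dst addr, zero, protocol 17, UDP length
    pseudo = (socket.inet_aton(src_ip) + socket.inet_aton(dst_ip)
              + struct.pack("!BBH", 0, 17, length))
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    # The whole payload must be available here, before the header is emitted.
    csum = ~ones_complement_sum(pseudo + header + payload) & 0xFFFF
    if csum == 0:
        csum = 0xFFFF                        # zero on the wire means "no checksum"
    return struct.pack("!HHHH", src_port, dst_port, length, csum) + payload


if __name__ == "__main__":
    dgram = build_udp_datagram("192.0.2.1", "192.0.2.2", 49152, 50000,
                               b"\x00\x01\x41\xff" + b"example payload")
    print(dgram.hex())
```

A trailer checksum would let the sender stream the payload and append the result afterward, which is the difference being discussed here.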

>>> Probably.  Is there
>>> any reason in the world to expect operators to pay the significant extra
>>> cost for such? Not that I can see.
>> 
>> We're talking about a ring of full adders, the specs for which are
>> given in an RFC that's 18 years old, and that is already implemented
>> in nearly every host interface, including 10 Gbps NICs.
>> 
>> And we're talking about "routers", many variants of which operate at
>> very high speeds and transparently proxy TCP already. So this is a
>> solved problem.
> 
> See prior email.  They don't need to look at the payload to modify IP
> or TCP headers and then update the checksum.
> 
>>> And even if we could and they would, that is not the world into which we
>>> are deploying these tunnels.
>> 
>> We're back to "that's not what they do now", at least in some devices.
>> 
>> Well, they don't use MPLS in UDP (since no spec exists), so clearly if
>> they're limited to doing what they already do, this is an exercise in
>> futility.
>> 
>> Joe
> 
> You seem to be missing the point that MPLS over UDP is not considered
> a good solution going forward on which to base the design of new
> hardware, but rather an interim solution to accommodate old hardware
> that doesn't load split MPLS traffic.

We live with interim solutions for decades.

I'd be OK with something that says that the UDP checksum SHOULD be used, but that legacy deployments MAY ignore it.

That solves what happens in the short term, but doesn't saddle us all with a poor solution if/when it persists.

> In any case, two passes, one to compute a checksum and put it on the
> front, would increase latency.

Again, latency is a complex issue, and store-and-forward delays inside routers are irrelevant for all but specialty deployments (especially stock trading), and in those cases the use of additional encapsulation should be avoided anyway.

Joe

>  If anything UDP-Heavy with an FCS at
> the end would be used, even though two FCS is considered bad form.
> 
> Curtis
> 
> 
>>> Yours,
>>> Joel
>>> 
>>> On 1/27/14 1:53 PM, Joe Touch wrote:
>>>> 
>>>> 
>>>> On 1/27/2014 10:48 AM, joel jaeggli wrote:
>>>>> On 1/27/14, 8:48 AM, Joe Touch wrote:
>>>>>> Those same mechanisms have provided hardware checksum support for a
>>>>>> very long time.
>>>>> 
>>>>> The new header and the payload are actually in different parts of the
>>>>> forwarding complex until they hit the output queue, you can't checksum
>>>>> data you don't have.
>>>> 
>>>> You can (and some do) checksum the component parts when things go into
>>>> memory; the partial sums can be added as the parts are combined in the
>>>> output queue.
>>>> 
>>>> I appreciate that we're all talking about what might be done, but the
>>>> reality is that there are many 'transparent TCP proxies' that have to do
>>>> this, so there's clearly a solution, and it clearly runs fast enough.
>>>> 
>>>> Joe
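
The partial-sum approach described in the quoted text above can be sketched as follows, in Python for illustration only. The helper names `partial_sum` and `combine` are invented for this example, and the sketch assumes every part except the last has even length, so that each part starts on a 16-bit boundary (otherwise a byte swap of the partial sum would be needed). One's-complement addition is associative and commutative, so sums taken over the new header and over the payload in different places can be added once the parts meet.

```python
def fold(total: int) -> int:
    """Fold carries above bit 15 back into the low 16 bits."""
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def partial_sum(part: bytes) -> int:
    """One's-complement sum of one packet component."""
    if len(part) % 2:
        part += b"\x00"
    return fold(sum((part[i] << 8) | part[i + 1]
                    for i in range(0, len(part), 2)))


def combine(*sums: int) -> int:
    """Add independently computed partial sums."""
    return fold(sum(sums))


if __name__ == "__main__":
    new_header = bytes.fromhex("0001f203")   # computed where the header is built
    payload = bytes.fromhex("f4f5f6f7")      # computed as the payload enters memory
    joined = combine(partial_sum(new_header), partial_sum(payload))
    print(joined == partial_sum(new_header + payload))
```

Only two 16-bit values need to travel with the parts to the output queue; the payload itself is not read a second time.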