Re: [RRG] Re: [RAM] Tunneling overheads and fragmentation

Iljitsch van Beijnum <iljitsch@muada.com> Tue, 11 September 2007 21:03 UTC

In-Reply-To: <46E6F514.1030206@gmail.com>
References: <469F7673.6070702@firstpr.com.au> <20070720140433.GA69215@Space.Net> <46A21AD6.2060501@firstpr.com.au> <0857530C-5C9D-4D29-ACAB-16A99CBFD929@muada.com> <46E6992D.2090501@firstpr.com.au> <46E6F514.1030206@gmail.com>
From: Iljitsch van Beijnum <iljitsch@muada.com>
Subject: Re: [RRG] Re: [RAM] Tunneling overheads and fragmentation
Date: Tue, 11 Sep 2007 23:01:44 +0200
To: Brian E Carpenter <brian.e.carpenter@gmail.com>
Cc: Robin Whittle <rw@firstpr.com.au>, RAM Mailing List <ram@iab.org>

[back on RAM because I believe that's where the LISP folks hang out
and this is relevant for them; besides, people may think I'm a
researcher...]

On 11-sep-2007, at 22:05, Brian E Carpenter wrote:

>>> I'm still operating under the assumption that both those places are in
>>> ISP networks. Now obviously there are lots of places in ISP networks
>>> that only support 1500-byte packets,

> Can somebody provide evidence for this statement?

Minira#conf t
Enter configuration commands, one per line.  End with CNTL/Z.
Minira(config)#interface fastEthernet 1/0
Minira(config-if)#ip mtu ?
   <68-1500>  MTU (bytes)

Granted, this is a fairly old box in a small network, but I don't see  
anyone seriously claiming that ALL ISP networks support packets  
larger than 1500 bytes on ALL their internal links (and also on inter- 
ISP links).

That said, it may very well be feasible, with a level of effort that
is within reason, to engineer a "jack up" solution such that the
encapsulated (and therefore larger) packets only see the parts of
networks that support more than 1500 bytes. (Let me put it this way:
what's easier to support: a million prefixes in the routing table or
a 1600-byte MTU?)
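
As a rough illustration of the "jack up" requirement, here is a minimal
sketch that checks whether every link on a path between two *TRs can
carry a 1500-byte inner packet plus encapsulation. The overhead figure
is an assumption for the sake of the example (outer IPv4 header, UDP
header, and a hypothetical 8-byte shim), not a number taken from any
specification, and the function names are made up for illustration.

```python
# Assumed encapsulation overhead, for illustration only:
# outer IPv4 header (20) + UDP header (8) + a hypothetical 8-byte shim.
OUTER_IPV4 = 20
UDP = 8
ASSUMED_SHIM = 8
ENCAP_OVERHEAD = OUTER_IPV4 + UDP + ASSUMED_SHIM


def required_link_mtu(inner_mtu=1500, overhead=ENCAP_OVERHEAD):
    """MTU every link must support to carry an encapsulated inner packet."""
    return inner_mtu + overhead


def bottlenecks(link_mtus, inner_mtu=1500, overhead=ENCAP_OVERHEAD):
    """Indexes of the links whose MTU is too small for encapsulated packets."""
    need = required_link_mtu(inner_mtu, overhead)
    return [i for i, mtu in enumerate(link_mtus) if mtu < need]


def path_supports(link_mtus, inner_mtu=1500, overhead=ENCAP_OVERHEAD):
    """True if no link on the path forces fragmentation of encapsulated packets."""
    return not bottlenecks(link_mtus, inner_mtu, overhead)


if __name__ == "__main__":
    path = [9000, 1500, 1600]  # per-link MTUs along a hypothetical path
    print(required_link_mtu(), path_supports(path), bottlenecks(path))
```

With these assumed numbers, a single 1500-byte link (like the
FastEthernet interface above) is enough to disqualify a path, while a
1600-byte MTU everywhere would leave headroom.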

>>> but what would be a better decision
>>> here: push out a reduced packet size EVERYWHERE which we probably won't
>>> be able to raise any time soon, or require ISPs to either:

>>> 1. Implement path MTU discovery correctly, or:
>>> 2. Make sure that all encapsulated packets travel over paths that
>>> support at least 1500 byte + encapsulation sized packets

>> All these options look impossible or ugly to me.

> The first one has been eluding us for years, but the network
> still works. What's the evidence on actual deployment? (Also
> see below.)

Please note that this is about the network between *TRs, which is a
much simpler universe than the host-to-host universe where weird
OSes, NAT and firewalls get in the way. (Again, assuming that
end-users don't get to run en/decapsulation boxes themselves.)

> The second one sounds like something that is in the ISPs'
> enlightened self interest, in which case it will happen.

Right. However, that would probably still be a deployment problem
because we may have to wait for upgrades to happen. Elsewhere, I was
more or less dragged into a discussion about tunnel MTU issues.
Believe me when I tell you that this is a minefield.

However, the good thing about a LISP-like solution is that we get to
design everything on all ends (ITR, ETR, mapping), so it's not too
hard to implement fragmentation at the encapsulation layer. The issue
with IPv4 encapsulation is that the 16-bit ID space is too small to
support decent packet rates. With something like LISP, we can
implement our own ID field there and severely tighten the reassembly
window (maybe even go so far as to require that fragments from one
source arrive in order), so this works much better. The mapping
system could distribute MTU/MRU information if that's helpful.
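
To make the ID-space point concrete, here is a minimal sketch under
stated assumptions. The first function computes how quickly an ID
field wraps at a given line rate; the rest is a toy
encapsulation-layer fragmenter with a strict in-order reassembler. The
header layout (32-bit packet ID, 8-bit fragment index, 8-bit fragment
count) and all names are hypothetical, chosen only for illustration;
they are not taken from LISP or any other specification.

```python
import struct


def id_wrap_seconds(id_bits, link_bps, packet_bytes):
    """Seconds until an ID field of id_bits wraps at full line rate."""
    packets_per_second = link_bps / (packet_bytes * 8)
    return (2 ** id_bits) / packets_per_second


# Hypothetical fragment header: packet ID, fragment index, fragment count.
HDR = struct.Struct("!IBB")


def fragment(packet_id, payload, max_payload):
    """Split payload into frames that each carry the hypothetical header."""
    chunks = [payload[i:i + max_payload]
              for i in range(0, len(payload), max_payload)] or [b""]
    if len(chunks) > 255:
        raise ValueError("too many fragments for an 8-bit count")
    return [HDR.pack(packet_id, idx, len(chunks)) + chunk
            for idx, chunk in enumerate(chunks)]


class InOrderReassembler:
    """Keeps at most one partial packet per source.

    Any fragment that is not the next expected one discards the partial
    packet, which is the "fragments must arrive in order" rule.
    """

    def __init__(self):
        self.partial = {}  # source -> (packet_id, next_index, total, chunks)

    def receive(self, source, frame):
        """Return the reassembled payload, or None if incomplete or dropped."""
        packet_id, idx, total = HDR.unpack_from(frame)
        chunk = frame[HDR.size:]
        state = self.partial.get(source)
        if idx == 0:
            # A first fragment always starts a fresh packet for this source.
            state = (packet_id, 0, total, [])
        elif (state is None or state[0] != packet_id
              or state[1] != idx or state[2] != total):
            # Out of order or unknown packet: drop whatever was pending.
            self.partial.pop(source, None)
            return None
        pid, nxt, tot, chunks = state
        chunks.append(chunk)
        if nxt + 1 == tot:
            self.partial.pop(source, None)
            return b"".join(chunks)
        self.partial[source] = (pid, nxt + 1, tot, chunks)
        return None


if __name__ == "__main__":
    print(id_wrap_seconds(16, 1e9, 1500))  # 16-bit ID at 1 Gbit/s
    print(id_wrap_seconds(32, 1e9, 1500))  # 32-bit ID at 1 Gbit/s
```

At 1 Gbit/s with 1500-byte packets, a 16-bit ID wraps in under a
second, whereas the assumed 32-bit ID takes roughly 14 hours. The
in-order rule means the reassembler holds at most one partial packet
per source, at the cost of dropping a whole packet whenever fragments
are reordered between the ITR and the ETR.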

[...]

> Hence RFC 4821, which does need to get deployed.

RFC 4821 (PMTUD without ICMP) is great for TCP, but it's not
reasonably implementable for most UDP-based protocols/applications.
It also suffers from the problem that you don't know if your
correspondent supports it, so it requires a leap of faith that things
will work out.
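
For readers unfamiliar with the idea, the probing approach can be
sketched roughly as a search over packet sizes. This is a simplified
illustration, not the procedure from RFC 4821: `probe` is a
hypothetical callback that sends one packet of the given size and
reports whether the packetization layer saw it acknowledged. TCP gets
that signal from its own ACKs; a UDP-based application would have to
supply an equivalent acknowledgement itself, which is exactly the
difficulty.

```python
def search_path_mtu(probe, low, high):
    """Largest size in [low, high] for which probe(size) succeeds.

    Assumes `low` is already known to work and that success is
    monotonic: if a size gets through, every smaller size does too.
    """
    while low < high:
        mid = (low + high + 1) // 2  # round up so the loop always progresses
        if probe(mid):
            low = mid        # probe was acknowledged: try larger sizes
        else:
            high = mid - 1   # probe was lost: the limit is below mid
    return low


if __name__ == "__main__":
    # Stand-in for a real path: anything up to 1536 bytes gets through.
    def fake_probe(size):
        return size <= 1536

    print(search_path_mtu(fake_probe, 1024, 9000))
```

A lost probe is indistinguishable from ordinary packet loss, so a real
implementation would need retries and timers that this sketch leaves
out.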
