Re: [RAM] Tunneling overheads and fragmentation

Robin Whittle <rw@firstpr.com.au> Sat, 21 July 2007 14:40 UTC

I just read:

  IP Encapsulation within IP
  http://tools.ietf.org/html/rfc2003

  Generic Routing Encapsulation (GRE)
  http://tools.ietf.org/html/rfc2784

  MTU and Fragmentation Issues with In-the-Network Tunneling
  http://tools.ietf.org/html/rfc4459

and the first few pages of:

  Packetization Layer Path MTU Discovery
  http://tools.ietf.org/html/rfc4821


I have added:

  IPv4 Reassembly Errors at High Data Rates
  http://tools.ietf.org/html/draft-heffner-frag-harmful-05

    IPv4 fragmentation is not sufficiently robust for use under
    some conditions in today's Internet.  At high data rates, the
    16-bit IP identification field is not large enough to prevent
    frequent incorrectly assembled IP fragments, and the TCP and
    UDP checksums are insufficient to prevent the resulting
    corrupted datagrams from being delivered to higher protocol
    layers. ...

to the To-Do list:

  http://www.firstpr.com.au/ip/ivip/to-do/


Gert's response and the responses of others make me think that
these proposals - LISP-NERD/CONS, eFIT-APT and Ivip - cannot
succeed in their goal of requiring no changes to hosts, except
perhaps where host packet sizes are always controlled by DHCP
settings.

Even then, I fear that in order to preserve both reachability and
efficiency (and to avoid any reachability problems which arise
from fragmentation), hosts in all networks, including non-upgraded
networks, will need to adopt a somewhat lower MTU setting - for
all the packets they send.

I wonder to what extent every possible application respects the
operating system's MTU.  Do operating systems simply refuse to
send any packet which is longer?  I wonder if there are any widely
used applications, such as games, P2P programs etc. which are
hard-coded to assume a certain MTU which is close to, or right at,
the limit of what can safely be sent across most of the Net.

Perhaps, in an optimistic scenario, recognising that some packets
are to be encapsulated by ITRs, an ISP network could set up its
DHCP system or whatever it is which gives DSL modems their
parameters, to reduce the MTU and MSS settings sufficiently that
the ITR-applied encapsulation still results in packets which will
not be fragmented in most transit and border routers.  I am
assuming a single level of encapsulation is all that is required.

Then, the hosts could generate marginally shorter packets - for
all packets sent, including those to non-mapped addresses - and
the packets which need to be encapsulated would get to the other
end without any fragmentation.
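
As a rough illustration of how far the settings would need to be
backed off, here is a minimal Python sketch.  The 1500 byte path MTU
is my assumption (a typical Ethernet figure, not something measured),
the function names are made up for this example, and the MSS
calculation assumes IPv4 and TCP headers without options.

```python
# Sketch only: how much a host's MTU and TCP MSS would need to be
# reduced so that an ITR can add one level of encapsulation without
# the result exceeding the path MTU.

IPV4_HEADER = 20   # bytes, no IP options
TCP_HEADER = 20    # bytes, no TCP options
ASSUMED_PATH_MTU = 1500  # assumption: Ethernet-sized path, not measured


def host_mtu(path_mtu: int, encap_overhead: int) -> int:
    """Largest packet a host can send so that the encapsulated
    packet still fits within path_mtu."""
    if encap_overhead < 0 or encap_overhead >= path_mtu:
        raise ValueError("encapsulation overhead must be in [0, path_mtu)")
    return path_mtu - encap_overhead


def tcp_mss(mtu: int) -> int:
    """TCP MSS for an IPv4 packet of the given MTU (no options)."""
    return mtu - IPV4_HEADER - TCP_HEADER


if __name__ == "__main__":
    # 20 bytes corresponds to IPv4 IP-in-IP; other schemes add more.
    for overhead in (0, 20):
        mtu = host_mtu(ASSUMED_PATH_MTU, overhead)
        print(f"overhead {overhead:2d}: host MTU {mtu}, TCP MSS {tcp_mss(mtu)}")
```

This only covers a single level of encapsulation and a known path
MTU.  It says nothing about paths whose MTU is already below the
assumed figure.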

This really needs to be done for all hosts in all networks - not
just hosts in networks which have been upgraded with ITRs and ETRs
etc.

Because there are no prospects of scaling BGP up to cope with
millions of advertised prefixes, every vaguely practical,
potentially incrementally deployable proposal for the routing and
addressing crisis involves tunneling.

I think the most likely way this fragmentation and MTU problem is
going to be solved is to back off the MTU and MSS settings on all
hosts, for all packets.  Even if the host had its own ITFH
(Ingress Tunnel Function in Host), I don't see how the
operating system could tell application programs that there is one
MTU and MSS setting for packets going to some addresses and
another setting for packets going to other addresses.

So unless someone figures out a totally different system, the only
way out of the current crisis will be to force all application
software on every host on the Net to generate somewhat shorter
packets.

If so, then I think this would be an argument in favor of the
shortest possible encapsulation system, which I think for IPv4 is
the IP-in-IP technique (RFC 2003), which adds 20 bytes.

UDP encapsulation, as used by LISP and I think eFIT-APT, involves
20 bytes for the IP header, 8 bytes for the UDP header and some
number of bytes, such as 4, for an extra header which presumably
precedes the encapsulated packet itself: for instance, 32 bits of
length and other fields.  With a 4 byte extra header, that is 32
bytes of overhead.  A 32 bit address to identify the ITR, needed if
my proposal for "outer SA = inner SA" is adopted, would add another
4 bytes.

This is where IPv6's long addresses and headers become really
ugly. There would be 40 bytes for IP-in-IP and 52 for basic UDP
encapsulation.
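
The overhead figures above can be tallied with a short Python
sketch.  The 4 byte extra header is the "such as 4" guess from this
message rather than a figure from any LISP or eFIT-APT specification,
and the names used here are purely illustrative.

```python
# Sketch only: per-packet encapsulation overhead for the schemes
# discussed above, using fixed header sizes without options.

IPV4_HEADER = 20   # bytes
IPV6_HEADER = 40   # bytes
UDP_HEADER = 8     # bytes
ASSUMED_EXTRA_HEADER = 4  # assumption: the "such as 4" bytes of extra header


def ip_in_ip_overhead(outer_ip_header: int) -> int:
    """IP-in-IP adds only the outer IP header."""
    return outer_ip_header


def udp_overhead(outer_ip_header: int, extra: int = ASSUMED_EXTRA_HEADER) -> int:
    """UDP encapsulation adds outer IP header + UDP header + extra header."""
    return outer_ip_header + UDP_HEADER + extra


def overhead_table() -> dict:
    """Overhead in bytes, keyed by (address family, scheme)."""
    return {
        ("IPv4", "IP-in-IP"): ip_in_ip_overhead(IPV4_HEADER),
        ("IPv4", "UDP"): udp_overhead(IPV4_HEADER),
        ("IPv6", "IP-in-IP"): ip_in_ip_overhead(IPV6_HEADER),
        ("IPv6", "UDP"): udp_overhead(IPV6_HEADER),
    }


if __name__ == "__main__":
    for (family, scheme), size in overhead_table().items():
        print(f"{family} {scheme}: {size} bytes")
```

Changing ASSUMED_EXTRA_HEADER shows how quickly any additional
fields in the extra header add to the per-packet cost.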

In my previous message:
    http://www1.ietf.org/mail-archive/web/ram/current/msg01729.html

I suggested a second reason (in addition to the original one of
making it easy to stop ETRs being used as a back-door around
filtering) for using "outer SA = inner SA": to allow ICMP messages
to go straight back to the sending host.  This would absolve the
ITR of the problematic, onerous or impossible task of receiving
ICMP messages coming back to it, and figuring out which sending
host to send back some ICMP message to.  (See RFC 4459.)

As far as I know, "outer SA = inner SA" is at odds with standard
practice, but I think it has some important benefits.

  - Robin
