[rrg] LMS alternative critique

Robin Whittle <rw@firstpr.com.au> Sat, 20 February 2010 08:31 UTC


Here is an 865-word critique of the 2009-12-24 version of the LMS
proposal (right click the Adobe Reader display of a page of this file,
once saved locally):

http://docs.google.com/fileview?id=0BwsJc7A4NTgeMzRlYWYzYjEtZTkyOS00ZjgwLWI5YjItYmUyNzJjNTIyZTJi&hl=en

There is currently a critique in the RRG Report ID:
  http://tools.ietf.org/html/draft-irtf-rrg-recommendation-04#section-8.2

I am not sure whether this applies to the earlier LMS proposal or to the
significantly updated 2009-12-24 version.

 - Robin





Layered Mapping System
----------------------

LMS is a step towards designing a complete Core-Edge Separation routing
scaling solution for both IPv4 and IPv6, somewhat along the lines of
LISP-ALT.

There are insufficient details in the proposal to evaluate how well the
basic infrastructure of ITRs and ETRs would work, considering the
unknown nature of mapping delays, reliability of the global mapping
system, the problems of Path MTU Discovery (due to tunneling via
encapsulation) and difficulties with an ITR reliably probing
reachability of the destination network via multiple ETRs.

Most of the proposal concerns a new global, distributed, mapping query
system based on the ALT concepts of each node in the tree-like structure
being a router, using tunnels to other such nodes, and all such nodes
using BGP to develop best paths to neighbours.

By a series of mathematical transformations the designers arrive at
certain design choices for IPv4 and IPv6.  For IPv4, if the entire
address space were to be mapped (and at present there is no need to do so
beyond 224.0.0.0) there would be 256 Layer 2 nodes.  Therefore, each
such node would be the authoritative query server for mapping for
however much "edge" space was used in an entire /8 of 16 million IPv4
addresses.  This is a wildly unrealistic demand of any single physical
server, considering that there are likely to be millions of separate
"edge" "EID" (LISP) prefixes in each such /8.  A single address in a
single prefix of such "edge" space could be extremely intensively used,
with tens or hundreds of thousands of hosts connecting to it in any
one-hour period, and in many instances each such connection would
result in a mapping query.
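
As a rough check on the figures above, the following sketch works out
how many addresses each of the 256 Layer 2 nodes would cover, and turns
a count of connecting hosts per hour into a sustained query rate.  The
function name and the queries_per_connection parameter are my own
assumptions for illustration; they do not come from the LMS proposal.

```python
# Back-of-envelope figures for the IPv4 case: one Layer 2 node per /8.
IPV4_BITS = 32
LAYER2_NODES = 256

# Addresses each node would be authoritative for (one whole /8).
addresses_per_node = 2 ** IPV4_BITS // LAYER2_NODES


def queries_per_second(connecting_hosts_per_hour, queries_per_connection=1.0):
    """Average query rate caused by hosts connecting to one edge address.

    queries_per_connection is a hypothetical knob: 1.0 means every new
    connection triggers a mapping query (no caching at the ITR).
    """
    return connecting_hosts_per_hour * queries_per_connection / 3600.0


if __name__ == "__main__":
    print(addresses_per_node)
    print(queries_per_second(360_000))
```

This covers only one heavily used address; a node serving a whole /8
would see the sum of such rates over every "edge" prefix it holds.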

If such an arrangement were feasible, there's no obvious reason why BGP
and tunnels would be used at all, since each ITR could easily be
configured with, or automatically discover, the IP addresses of all
these nodes.

A more realistic arrangement might be to have the "edge" space broken
into a larger number of sections, such as 2^22 (4 million) divisions of
2^10 IPv4 addresses each.  If (and even this is questionable, though it
would frequently be true) each such division could be managed by a
single node (a single authoritative query server), then it would be
perfectly feasible for each ITR to automatically discover the IP
addresses of all 2^22 potential query servers.  In practice, in some
cases, no such query server would be required, since none of those 1024
IPv4 addresses would be "edge" space.  In other cases, where actual
query rates were low enough, a single server could handle queries for
several such 1024-address ranges of "edge" space.
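
The division scheme described above can be sketched as follows: the top
22 bits of an IPv4 address select one of the 2^22 divisions, each of
which is a /22 of 1024 addresses.  The function names are hypothetical,
and how an ITR would find the query server for a given division is
deliberately left out, since nothing here specifies it.

```python
import ipaddress

DIVISION_BITS = 10                          # 2^10 = 1024 addresses each
PREFIX_LEN = 32 - DIVISION_BITS             # every division is a /22
NUM_DIVISIONS = 2 ** PREFIX_LEN             # 2^22 divisions in total


def division_index(addr: str) -> int:
    """Return the index (0 .. 2^22 - 1) of the division holding addr."""
    return int(ipaddress.IPv4Address(addr)) >> DIVISION_BITS


def division_range(index: int) -> ipaddress.IPv4Network:
    """Return the /22 network covered by the given division index."""
    first = ipaddress.IPv4Address(index << DIVISION_BITS)
    return ipaddress.IPv4Network(f"{first}/{PREFIX_LEN}")


if __name__ == "__main__":
    idx = division_index("192.0.2.1")
    print(idx, division_range(idx))
```

A single server handling several divisions would simply appear more
than once in whatever index-to-server table the ITR builds.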

A simple ground-up engineering evaluation would thus produce a much more
practical solution than the highly contrived top-down mathematical model
- which considered only storage requirements for mapping data in each
query server, and not the volume of queries.

The IPv6 arrangement of two layers (a third layer is the single top
node) seems even more unrealistic.  Although the IPv6 address space is
likely to remain sparsely used for a long time, due to its inherent
vastness, the LMS plan calls for up to 2^24 Layer 2 nodes, each with up
to 2^24 Layer 3 nodes underneath.  This proposal seems wildly
unrealistic as stated - since each such node would need to accommodate
up to 2^24 + 1 tunnels and BGP sessions with its neighbours.
Furthermore, the top node (Layer 1) has to cope with most query packets,
since there is no suggestion that Layer 2 nodes would be fully or even
partially meshed.
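
To make the scale concrete, this sketch counts the tunnels and BGP
sessions implied by the upper bounds quoted above.  It assumes the
worst case in which every Layer 2 and Layer 3 node exists, which the
sparse use of IPv6 space would not reach in practice.

```python
# Upper bounds for the IPv6 tree as described in the critique.
LAYER2_NODES = 2 ** 24          # children of the single Layer 1 node
LAYER3_PER_LAYER2 = 2 ** 24     # children of each Layer 2 node

# Each Layer 2 node peers with all its children plus its one parent.
sessions_per_layer2_node = LAYER3_PER_LAYER2 + 1

# The top node peers with every Layer 2 node.
sessions_at_top_node = LAYER2_NODES

# Total Layer 3 nodes if the tree were fully populated.
total_layer3_nodes = LAYER2_NODES * LAYER3_PER_LAYER2

if __name__ == "__main__":
    print(sessions_per_layer2_node)
    print(sessions_at_top_node)
    print(total_layer3_nodes)
```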

Even with some unspecified caching times, the prodigious query rates
considered in section 3.2.2 cannot be suitably handled by either of the
IPv4 or IPv6 structures in the proposal.

While there is reference to redundant hardware for each node, there is
no discussion of how to implement this.  This is one of the problems
which so far has bedevilled LISP-ALT: how to add nodes to this highly
aggregated network structure so that there is no single point of
failure.  For instance, how in the IPv6 system could there be two
physical nodes, each performing the role of a given Layer 2 node, in
topologically diverse locations - without adding great complexity and
greater numbers of tunnels and BGP sessions?

The suggestion (section 5) that core (DFZ) routers not maintain a full
FIB, but rather hold a packet which does not match any FIB prefix, pay
for a mapping lookup, await the result, update the FIB (a frequently
expensive operation) and then forward the packet - is also wildly
unrealistic.
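
One cost of the hold-and-lookup approach can be estimated with simple
arithmetic: every packet arriving for an unmatched prefix while the
lookup is outstanding must be buffered.  The sketch below is only an
illustration; the packet rate and lookup delay are hypothetical inputs,
not figures from the LMS proposal, and FIB update time is ignored.

```python
def packets_held_during_lookup(rate_pps: int, lookup_delay_ms: int) -> int:
    """Packets a router must buffer while one mapping lookup is pending.

    rate_pps:        hypothetical arrival rate for the unmatched prefix
    lookup_delay_ms: hypothetical round-trip time of the mapping query
    The result is rounded up to a whole packet.
    """
    return (rate_pps * lookup_delay_ms + 999) // 1000


if __name__ == "__main__":
    # Example inputs chosen purely for illustration.
    print(packets_held_during_lookup(1000, 100))
```

The buffer requirement grows with both the delay and the number of
prefixes being looked up at the same time.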

It is important to research alternative approaches when existing methods
are perceived as facing serious problems, as is the case with LISP-ALT.
In this case, the proposed solution is unlikely to improve on any ALT
arrangement that would arise from a more hand-crafted design
methodology.

The LMS proposal, as it currently stands, is far too incomplete to be
considered suitable for further development ahead of some other
proposals.  It represents the efforts of a creative team to improve on
LISP-ALT, and its shortcomings do not necessarily mean that all such
attempts at improvement would lead to such impractical choices.