[rrg] Scalable routing problem & architectural enhancements

Robin Whittle <rw@firstpr.com.au> Tue, 23 February 2010 06:11 UTC

Here is my understanding of the scalable routing problem, and how
this is part of the broader question of how to devise a "once in
several decades" architectural enhancement for the IPv4 and IPv6
Internets.

 - Robin               http://www.firstpr.com.au/ip/ivip/



The scalable routing problem
============================

For brevity:

   EUN =   End User Network.

   PMHTE = Portability, Multihoming and/or Inbound Traffic
           Engineering.  These are the benefits many EUNs seek and
           which they can currently only gain by advertising PI
           prefixes in the DFZ - which is the cause of the
           scaling problems.  See definitions in point 4 below.

   CES =   Core-Edge (address) Separation architecture - such as
           LISP, APT, Ivip, TRRP, TIDR or IRON-RANGER.  See
           (msg05865).  CES architectures require no host changes
           and maintain the current two-level IPv4/v6 naming
           model.  (msg05864)

   CEE =   Core-Edge (address) Elimination architecture, such as
           GSE, ILNP, GLI-Split and Name-Based Sockets.  These
           only work for IPv6 and require host changes - with
           rewritten applications for ILNP and Name-Based Sockets.
           All CEE architectures alter the naming model to one
           of "Locator/Identifier Separation" (LISP does not do
           this - it is a CES architecture).


IPv4 scaling problem and solution constraints
---------------------------------------------

There are several broad aspects to the IPv4 routing scaling problem.
No-one appears to know for sure how many DFZ routers there are, but
all Internet users pay for them indirectly, since all substantial
ISPs have DFZ routers - and any smaller ISP without DFZ routers is
relying on an upstream larger ISP which does.

Also, larger EUNs have DFZ routers, and Internet users in general pay
for the operating costs of many of these larger EUNs.  Furthermore
the reliability of Internet communications depends on the DFZ routers
quickly adjusting themselves ("converge") to be able to deliver
packets by suitably short paths, with as little packet loss as
possible due to congestion or paths leading to black holes.

Here are the aspects I perceive to the IPv4 routing scaling problem -
most of which also apply to IPv6.

  1 - The growth in the number of prefixes advertised in the DFZ:

        http://bgp.potaroo.net

      Currently about 300k with a doubling time of about 5 years.
      This figure drives several problems:

      a - Increasing expense for each DFZ router due to the
          correspondingly high number of prefixes in the FIBs.

      b - Potentially greater difficulties updating the FIB, such
          as computational cost and/or time the FIB can't be used
          for classifying packets while updates are applied.

      c - extra load on the RIBs due to the number of prefixes - with
          DFZ routers with more neighbours being more heavily
          affected in terms of the CPU and RAM requirements for
          maintaining a two-way conversation with each neighbour
          about each prefix.

      d - Exacerbating problems with overall stability of the DFZ
          control plane.  For instance, a single outage will affect
          more prefixes, causing more BGP processing, RIB and FIB
          updating, more BGP changed announcements etc.  Since
          routers take time to do this, and due to other problems
          with the BGP control plane (such as unwanted updates
          occurring due to combinations of topology and MRAI timer
          settings) the ability of the entire system to adapt to
          changes and "converge" is lessened.  "Converge" means
          firstly all routers finding best paths which acceptably
          forward traffic packets and secondly finding best paths
          which are stable without requiring any further changes.

  2 - Growth in rate of updates to how these prefixes are announced.
      The number of prefixes represents one dimension of the load on
      the BGP control plane.  Each change adds further load, so the
      number of changes in general adds to the routing scaling
      problem.  Some changes only require adjustments to best paths
      of a few DFZ routers.  Other changes could, in principle,
      require changes to best path, or at least changes to the
      AS details each DFZ router announces for the best path (even
      if the best path remains the same) affecting many, most or
      all DFZ routers.

  3 - It is widely agreed that the DFZ can and must scale to handle
      the prefixes advertised by ISPs - and that the scalability
      problem is due to unsustainable growth in the number of EUNs
      which advertise their own prefixes - and in the rates of
      change in the way these prefixes are added.  These prefixes
      are "PI" prefixes.  (I use this term to include an alternative
      arrangement with the same impact - an EUN with a PA prefix from
      one ISP and having it advertised in the DFZ by another
      ISP.)

  4 - Consequently, there are high barriers to EUNs obtaining PI
      space and advertising it in the DFZ.  These are not high
      enough to acceptably constrain the growth in the number of
      advertised PI prefixes, but they are already unacceptably
      high compared to the need for millions (up to 10 million)
      EUNs to gain benefits which are currently only available via
      advertising PI space.

      Advertising PI space in the DFZ is the only method by which
      an EUN can currently obtain the PMHTE benefits:

          Portability - keeping its space when choosing another ISP.

          Multihoming - two or more ISPs with session survivability
                        when switching between them to cope with
                        ISP or link failure.

          Inbound TE -  When two or more ISPs are used, the ability
                        to steer incoming traffic - potentially of
                        different sorts - between these ISP links for
                        the purposes of load balancing, optimising
                        costs, latency, reliability or other
                        considerations.

      These high barriers reduce the number of EUNs which can gain
      these PMHTE benefits.  These barriers also increase the costs
      of those which are able to use PI space.

  5 - This leads to a more difficult to measure aspect of the
      problem: a large number of EUNs which want or need PMHTE
      benefits and are currently unable to obtain them, due to the cost
      and administrative barriers to obtaining and advertising PI
      space.

  6 - A further aspect of point 5 is that due to the convention of
      not propagating prefixes longer than /24 in the IPv4 DFZ -
      an EUN needs at least a /24 prefix before it can obtain PMHTE
      benefits.  EUNs probably need multiple /24s to be able to do
      inbound TE.

      While the total amount of space used in advertised /24s is
      only a fraction of a percent of advertised space, if it were
      somehow possible for millions of EUNs to have their own PI
      prefixes, the /24 limit would cause many of them to use more
      space than they actually need.  The fact that /24s - the
      longest prefixes generally propagated - are far more numerous
      than prefixes of any other length suggests that the majority
      of EUNs need 256 IPv4 addresses or fewer.
      (msg06092).
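
As a rough illustration of the /24 floor described in point 6, the
following sketch computes what fraction of its advertised block an EUN
actually needs.  The function names and the sample host counts are
hypothetical, chosen only for illustration - they are not measurements.

```python
import ipaddress

# Longest prefix conventionally propagated in the IPv4 DFZ (from the text).
DFZ_MAX_PREFIX_LEN = 24


def smallest_advertisable_prefix_len(addresses_needed: int) -> int:
    """Longest prefix length that holds the requested number of addresses
    while being no longer than /24."""
    if not 1 <= addresses_needed <= 2 ** 32:
        raise ValueError("addresses_needed must be between 1 and 2**32")
    # Number of host bits in the smallest power-of-two block that fits.
    host_bits = (addresses_needed - 1).bit_length()
    return min(32 - host_bits, DFZ_MAX_PREFIX_LEN)


def utilization(addresses_needed: int) -> float:
    """Fraction of the advertised block which the EUN actually needs."""
    plen = smallest_advertisable_prefix_len(addresses_needed)
    block = ipaddress.ip_network(f"0.0.0.0/{plen}").num_addresses
    return addresses_needed / block


# Hypothetical EUN sizes, for illustration only.
for hosts in (4, 16, 64, 256, 300):
    plen = smallest_advertisable_prefix_len(hosts)
    print(f"{hosts:4d} addresses -> /{plen}, utilization {utilization(hosts):.1%}")
```

Under this simple model an EUN needing 16 addresses must still
advertise a /24 and so uses only 1/16 of it.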


There appears to be broad agreement that the solution to these
problems cannot involve any of:

  1 - Souping up all DFZ routers with faster route processors,
      bigger FIB capacity etc.  (This will continue, but the
      current rates of growth in the DFZ and the growing number
      of EUNs which can't get PMHTE benefits, together with the
      increased costs for all DFZ router operators of paying for
      these more capable routers constitutes the ongoing nature
      of the routing scaling problem.)

  2 - Replacing the DFZ with anything fundamentally different.
      Firstly, it is not clear what would work better than BGP.
      Secondly, if something different would work better - there
      seem to be insurmountable barriers to adopting it.

  3 - Erecting financial or administrative barriers to EUNs obtaining
      PI space and advertising it in the DFZ.  This would be
      administratively difficult, would introduce competition
      policy problems and would merely contain one aspect of the
      scaling problem (the growth in the number of DFZ prefixes)
      by worsening the other aspects: the number of EUNs which can't
      get PMHTE benefits and the higher costs incurred by those who
      are able to advertise PI space.

  4 - Solutions to the "portability" problem along the lines of
      EUNs not having truly portable space (or in a CEE architecture,
      not having portable host identifiers) and having some
      supposedly "acceptable" means of automatically renumbering
      the hosts and routers in their networks, when changing ISPs.
      For many networks, no such mechanism can come close to the
      benefits of true portability, because the IP addresses of their
      network and hosts may be stored in many other places beyond
      their control, such as ACLs.

      For example, a university may subscribe to many journals and
      the journal sites give free access to hosts in the university's
      address prefix - using a simple router-based ACL.  Changing
      the address range of the entire university network, to use a
      new ISP, would involve complex, expensive and error-prone
      administrative work for the university and all the journals
      it subscribes to.
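
The journal example can be sketched in a few lines.  This is only an
illustration of a prefix-based ACL check: the prefixes are reserved
documentation ranges, and the names journal_acl and is_permitted are
hypothetical - real router ACLs are configured quite differently.

```python
import ipaddress

# Hypothetical ACL kept by a journal site: the subscriber's old prefix.
journal_acl = [ipaddress.ip_network("198.51.100.0/24")]


def is_permitted(source: str, acl) -> bool:
    """Return True if the source address falls inside any ACL prefix."""
    addr = ipaddress.ip_address(source)
    return any(addr in net for net in acl)


# The same campus host before and after the university renumbers into
# a prefix from its new ISP (both are documentation addresses).
host_before = "198.51.100.25"
host_after = "203.0.113.25"

print(is_permitted(host_before, journal_acl))  # True
print(is_permitted(host_after, journal_acl))   # False
```

Until every journal updates its own copy of the ACL, the renumbered
host is refused - and the university controls none of those copies.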


Therefore, a solution to the IPv4 routing scaling problem would
involve some combination of the following:

  1 - Large numbers of EUNs being able to gain PMHTE - Portability,
      Multi-Homing and inbound TE - for their networks in a manner
      which placed little extra burden on the DFZ control plane
      in general, and on the RIBs and FIBs of DFZ routers.

      There is no consensus on numbers of such EUNs with fixed
      networks which will want or need PMHTE benefits - but Brian
      Carpenter and I have independently suggested 10 million as a
      long-term upper bound.  (BC msg05801).  The RADIR Problem
      Statement refers to "millions": "there are millions of
      potential end sites which would benefit from being able to
      multihome".

  2 - Current EUNs with PI space being able to adopt - and actually
      adopting - some scalable alternative to their current PI
      arrangements.  Likewise for EUNs which, in the absence of a
      solution, would in the future advertise PI space in the DFZ.

However, the solution must not involve any of:

  1 - Less efficient use of IPv4 address space.  (This rules
      out CEE, since a CEE-using EUN with N hosts requires at
      least N addresses from each ISP.)

  2 - Changes to host stacks or applications - since this would
      rule out the possibility of the changes being widely enough
      adopted to solve the scaling problem.  See:

      List of constraints on a successful scalable routing solution
      which result from the need for widespread voluntary adoption
      http://www.firstpr.com.au/ip/ivip/RRG-2009/constraints/

  3 - Assuming that IPv4 usage will decline (as most end-users
      migrate to IPv6 and no longer need IPv4 space) fast enough
      to make the IPv4 scaling problem not worth solving.

Future trends for the IPv4 routing scaling problem are the subject of
debate.  Some argue that since IPv4 space is running out rapidly that
migration to IPv6 must begin soon and that therefore the routing
scaling problem in IPv4 will become either less significant and/or
will not grow as fast as in the past.

I believe the growth will accelerate as there is more slicing and
dicing of the available space to use it with more total hosts -
driving up the number of prefixes in the DFZ, of both EUNs and ISPs.
See the (msg05946) thread mentioned below for more on this.

In several recent RRG messages (sorry, I don't have their numbers
handy) people have expressed the view that IPv4 will be important for
a very long time, or indefinitely.

I believe that attempting to solve the IPv4 routing scaling problem
by solving the IPv6 routing scaling problem and moving the great
majority of end-users to IPv6 would be analogous to solving the
Earth's global warming problem by solving any such problems on Mars -
and assuming that most of humanity will move to Mars since Earth is
so obviously over-crowded.


Since we can only develop and suggest the adoption of technologies to
solve the routing scaling problem, and since it seems we need wide
adoption (such as 90% or more) to substantially solve the
problem - considering the two or more orders of magnitude difference
between the current 300k prefixes and a likely 10 million figure - we
face some "constraints due to the need for widespread voluntary
adoption".  Please see my page where I attempt to describe these -
which has been improved after some RRG discussions:

  http://www.firstpr.com.au/ip/ivip/RRG-2009/constraints/
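
To make the adoption argument concrete, here is a minimal sketch of
how many EUN prefixes would remain in the DFZ at various adoption
levels.  It assumes, purely for illustration, that every EUN which does
not adopt the scalable solution advertises exactly one PI prefix; the
300k and 10 million figures are the ones used above, and the function
name is hypothetical.

```python
CURRENT_DFZ_PREFIXES = 300_000   # approximate figure from the text
LONG_TERM_EUNS = 10_000_000      # suggested long-term upper bound


def residual_pi_prefixes(euns: int, adoption_percent: int) -> int:
    """PI prefixes left in the DFZ if each non-adopting EUN advertises
    one prefix (an illustrative assumption)."""
    if not 0 <= adoption_percent <= 100:
        raise ValueError("adoption_percent must be between 0 and 100")
    return euns * (100 - adoption_percent) // 100


for pct in (50, 90, 99):
    left = residual_pi_prefixes(LONG_TERM_EUNS, pct)
    ratio = left / CURRENT_DFZ_PREFIXES
    print(f"{pct}% adoption: {left:,} PI prefixes ({ratio:.1f}x today's table)")
```

Under these assumptions even 90% adoption leaves about a million such
prefixes, more than three times today's entire DFZ table.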



IPv6 scaling problem and solution constraints
---------------------------------------------

The IPv6 Internet currently has no scaling problem.  For a fuller
discussion, please see the thread:

   IPv4 & IPv6 routing scaling problems
   http://www.ietf.org/mail-archive/web/rrg/current/msg05946.html

     The IPv6 Internet has no scaling problem:

       http://bgp.potaroo.net/v6/as2.0/
       http://bgp.potaroo.net/v6/as6447/

     These indicate about 2.6k prefixes with a doubling time
     of about 20 months.  At these rates it would take 11.4
     years (2021) for this measure of the IPv6 scaling problem
     to reach the level of today's IPv4 scaling problem.  By
     then, at the current growth rates, the IPv4 figure would
     be about 1.46 million.
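
The projection above can be reproduced with a short calculation.  This
sketch assumes steady exponential growth at the quoted doubling times,
which is a simplification; the function names are illustrative.

```python
import math

IPV4_PREFIXES = 300_000        # approximate DFZ count from the text
IPV4_DOUBLING_YEARS = 5.0
IPV6_PREFIXES = 2_600
IPV6_DOUBLING_YEARS = 20 / 12  # about 20 months


def years_to_reach(start: float, target: float, doubling_years: float) -> float:
    """Years of steady exponential growth needed to go from start to target."""
    return math.log2(target / start) * doubling_years


def projected(start: float, years: float, doubling_years: float) -> float:
    """Value after the given number of years at a fixed doubling time."""
    return start * 2 ** (years / doubling_years)


years = years_to_reach(IPV6_PREFIXES, IPV4_PREFIXES, IPV6_DOUBLING_YEARS)
ipv4_then = projected(IPV4_PREFIXES, years, IPV4_DOUBLING_YEARS)
print(f"{years:.1f} years, IPv4 then about {ipv4_then / 1e6:.2f} million")
```

This prints 11.4 years and about 1.46 million, matching the figures
quoted above.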

The IPv6 scalable routing problem - if and when it arises - is in
principle the same as IPv4's.  However, there are some differing
constraints on the IPv6 solution - and potentially some different
techniques which are applicable to IPv6 which won't work on IPv4.

The constraints on IPv4 solutions due to shortage of global unicast
address space don't apply to IPv6.  So CEE architectures could solve
the IPv6 problem.  CEE architectures are wasteful of global unicast
address space, since each multihomed EUN with a /X prefix of IP
address space for the Locators of its hosts needs to obtain a /X PA
prefix from each of its upstream ISPs.  Also, some CEE architectures
implement Locator / Identifier Separation by encoding both the host's
Identifier and its Locator into the one IPv6 address.

The Modified Header Forwarding techniques of Ivip - alternatives to
encapsulation for tunneling packets from ITRs to ETRs - are different
for IPv4 and IPv6:

   http://tools.ietf.org/html/draft-whittle-ivip-etr-addr-forw
   http://www.firstpr.com.au/ip/ivip/PLF-for-IPv6/

Also, with IPv6's much lower base of hosts and routers - and with the
lesser urgency of widely deploying a solution - there is probably a
lot more scope than in IPv4 for upgrading routers (including DFZ
routers), host stacks and perhaps applications.  Still, I believe
that requiring applications to be altered in any way presents the
greatest of all barriers to adoption.  This is particularly the case
for the updates which would be required for an IPv4 or IPv6
application to use the CEE naming model - Locator / Identity
Separation.  This may be easy for some applications, but would
require fundamental changes to protocols for others.

It is not absolutely necessary that the IPv4 and IPv6 routing scaling
problems be solved in the same way.  However, the task of making
applications run on all of these systems at the same time - or at
least of having one code-base and set of basic application protocols
which works on all of them:

   IPv4
   IPv4 with scalability solution

   IPv6
   IPv6 with scalability solution

is a further argument against CEE.  Only CES architectures require no
changes to applications or stacks.  It would be wildly unrealistic to
expect all IPv4 applications to be altered in any way - for any
reason at all, including scalable routing.  Of the CEE RRG proposals,
only GLI-Split requires no changes to IPv6 applications - but I am
not yet convinced it will work.  (I am yet to get a response from
Michael Menth and colleagues to my msg06056.)


Other problems and goals
========================

I think it would be irresponsible and impractical to develop and
attempt to deploy architectural enhancements - each a once in several
decades opportunity for improving the IPv4 and IPv6 Internets - for
the purpose of solving only a subset of these problems:


   IPv4:   Routing scalability
           Address exhaustion
           Mobility


   IPv6:   Routing scalability
           Mobility

IPv4 address exhaustion and Mobility are discussed in sections below.

The forthcoming enhancements must be an effective solution for all
three IPv4 problems and both IPv6 problems.

First I want to mention some other potential problems which an
architectural enhancement might address (or be constrained by), since
the IPv4 and IPv6 Internets face pressing problems beyond those
listed above.


Computer Security
-----------------

There are general problems of computer security, the ability of
attackers to directly gain control of hosts, or trick their users
into giving them control - but these do not seem to be amenable to
solution in the network itself.


DoS and other attacks
---------------------

There is a general, very serious, problem with the ability of
attackers to mount DoS and other attacks which disrupt Internet
communications.

This is largely a function of the ability of attackers to assemble
(or hire the services of) immense botnets of hundreds, thousands or
millions of compromised hosts, each capable of firing out packets to
the target.  That ability in turn results from the number of
Net-connected computers, the insecurity of their operating systems
and applications, and the lack of care and expertise of their owners.
The widespread adoption of faster DSL services - and especially
fibre services with much faster upstream links than are possible with
DSL, HFC cable or wireless links - will make this problem even worse.

If it looks like there might be some method of reducing this problem
as part of an architectural enhancement, I think this should be
explored.  The ability of a CES architecture such as Ivip or LISP to
in some way help with this problem has not really been explored as
far as I know.  I am not sure there is a way of doing so - but it
should be investigated as CES architectures are further developed.


Path MTU Discovery and packet length limits
-------------------------------------------

This is a whole can of worms for IPv4 and IPv6.  Please refer to the
thread:

  Fred's IPv4 PMTUD research: RFC1191 support frequently broken
  http://www.ietf.org/mail-archive/web/rrg/current/msg05910.html

for more information.

PMTUD problems apparently cause minor data loss today on IPv4.  The
problems generally involve some tunnels and networks either not
producing PTBs (Packet Too Big messages) or dropping them with
filters.  There may also be a problem with some hosts (stacks and/or
applications?) not responding properly to PTBs.

These problems lead to workarounds in which packet lengths are
artificially limited - which means that it will be impossible to use
~9k byte jumboframe paths across the DFZ as these become available.
In short, the current situation seems to indefinitely lock us into
slightly sub-1500 byte packets.

There are several worrying things about this.  Firstly, the problems
result usually as a combination of circumstances - and can be
difficult to recognise, debug and isolate to the level of identifying
which ISP or other network did the wrong thing with their tunnel
arrangements and/or over-zealous ICMP filtering.

Secondly, each person who encounters these problems tends to adopt
quick-fixes which mask the fundamental problems - thereby enabling
the problems to continue and proliferate, while locking us more and
more into sub-1500 byte packet lengths indefinitely.

Thirdly, neither CES nor CEE architectures offer any solution to
these problems.

Finally, and most troubling, CES architectures need to use
encapsulation between ITRs and ETRs - unless one of Ivip's Modified
Header Forwarding techniques can be deployed, by upgrading all DFZ
routers, before the Ivip CES architecture itself is deployed.  (In
the long-term Ivip should be able to transition from encapsulation to
MHF.)  So PMTUD problems may make it significantly more difficult to
introduce a CES architecture.

I am unsure how serious these problems will be - but ITRs will need
to be able to send PTBs to sending hosts and have the sending hosts
respond with suitably shortened packets.  If the ITR is outside some
network where the sending host doesn't get these PTBs due to
over-zealous filtering of incoming ICMPs by that network, then this
will cause real difficulties with the CES architecture.  The CES
tunneling will frequently reduce packet sizes below what most
networks are used to getting at present.
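
The size reduction is simple arithmetic, sketched below.  The header
sizes are assumptions for illustration only: 20 bytes for a plain
outer IPv4 header, and 36 bytes for an outer IPv4 header plus UDP plus
an 8-byte shim header.  The function names are hypothetical and do not
come from any CES specification.

```python
# Assumed encapsulation overheads in bytes (illustrative values).
OVERHEAD_IP_IN_IP_V4 = 20            # outer IPv4 header only
OVERHEAD_IPV4_UDP_SHIM = 20 + 8 + 8  # outer IPv4 + UDP + 8-byte shim


def inner_mtu(tunnel_path_mtu: int, overhead: int) -> int:
    """Largest packet a sending host can emit so that the encapsulated
    packet still fits the ITR-to-ETR path MTU."""
    if tunnel_path_mtu <= overhead:
        raise ValueError("path MTU too small for this encapsulation")
    return tunnel_path_mtu - overhead


def needs_ptb(packet_len: int, tunnel_path_mtu: int, overhead: int) -> bool:
    """True if the ITR would have to send a Packet Too Big message,
    assuming the packet may not be fragmented."""
    return packet_len > inner_mtu(tunnel_path_mtu, overhead)


print(inner_mtu(1500, OVERHEAD_IPV4_UDP_SHIM))        # 1464
print(needs_ptb(1500, 1500, OVERHEAD_IP_IN_IP_V4))    # True
```

So a host which sends full 1500 byte packets across a 1500 byte path
depends on receiving the PTB - which is exactly what over-zealous ICMP
filtering prevents.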

There is only one proper answer to this situation - removing the
overzealous filtering.  Placing the ITR in the network would fix this
problem - but make it harder (not impossible, at least with Ivip's
IPTM arrangement) to do PMTUD for the tunnel to the ETR, because then
the overzealous filtering would drop PTBs coming from routers in any
part of the ITR to ETR tunnel.

Tunnels in the DFZ or other networks which are part of the ITR-->ETR
tunnel and which either don't produce valid PTBs, or only produce
them on the second or subsequent occasion they drop a packet, make it
more difficult for ITRs to successfully judge the Path MTU to the
ETR.  This is not insurmountable - but it delays the ability of the
ITR to correctly inform the sending host of the MTU it must use.

My view is that these PMTUD problems arise from people doing things
which are at odds with the Internet standards, and for which there is
no reasonable means of coping defensively.  So I believe it would be
easier and better to identify these bad practices and have these
problems fixed, rather than crafting new protocols, such as RFC 4821,
to "route around" the problems.   Fred Templin has a different view.

I think the PMTUD difficulties are probably worthy of a concerted
research response along the lines of the current phase of scalable
routing research, which was kicked off by the 2006 RAWS conference in
Amsterdam.


IPv4 address exhaustion and efficient utilization
-------------------------------------------------

This is a far more pressing and well accepted problem than scalable
routing.  It is pretty much common knowledge amongst all IT people,
while scalable routing is not so widely known.  Also, I think, even
some within the field regard scalable routing as a non-problem - well
within the capability of normal (Moore's law etc.) router
technological improvements.

I believe that a successful IPv4 scalable routing solution will, to
the maximum extent possible, "solve" the address exhaustion problem
by allowing very high rates of utilization - the percentage of
actively used IPv4 addresses out of each advertised prefix.  Each
IPv4 address may be used for a single host, a NAT box at the end of
DSL, fibre, HFC or wireless service, or a single mobile host.

I believe the only class of solutions which can work for IPv4 is CES
and, of the CES RRG proposals, only Ivip or LISP could be
successful.  (APT could also work, I think, but it is no longer being
developed.)

Both Ivip and LISP enable the slicing of "edge" space (SPI for Ivip,
EID for LISP) down to separately mapped chunks as small as a single
IPv4 address ("/32").  Ivip's micronets can be any integer number of
IPv4 addresses, while LISP only works in power-of-two prefixes.  But
both could, in principle and in practice, scalably allow "edge" space
to be used very flexibly and with a close to 100% utilization level.
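
The difference between integer-length ranges and power-of-two prefixes
can be illustrated with Python's standard ipaddress module.  This
sketch shows only the address arithmetic - it does not model either
mapping system - and the function name and the example range (from a
documentation prefix) are hypothetical.

```python
import ipaddress


def cidr_cover(first: str, count: int):
    """Minimal list of CIDR prefixes exactly covering `count`
    consecutive IPv4 addresses starting at `first`."""
    if count < 1:
        raise ValueError("count must be at least 1")
    start = ipaddress.IPv4Address(first)
    end = start + (count - 1)
    return list(ipaddress.summarize_address_range(start, end))


# Five consecutive addresses: one contiguous range, but two prefixes.
for net in cidr_cover("192.0.2.3", 5):
    print(net)
# 192.0.2.3/32
# 192.0.2.4/30
```

A range which is a single item when expressed as start-plus-length
therefore needs two separately mapped items when only power-of-two
prefixes are available.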

This won't forever solve the IPv4 address shortage problem.  But
considering the improvements which can be made to utilization in this
manner, and the fact that only about 2.2 billion IPv4 addresses are
so far advertised, out of a total advertisable global unicast range
of 3.7 billion:

  "Advertised Address Space" in:
  http://bgp.potaroo.net/bgprpts/rva-index.html

I think the successful adoption of a CES architecture such as Ivip or
LISP will give us another 10 or more years of IPv4 usage without
really "running out" of addresses *except* for the desire to give 5
billion or more mobile devices their own global unicast scalable (SPI
or EID) "edge" address.  Most of those billions of mobile IP devices
are going to have to work either behind NAT for IPv4 and/or on IPv6.

Since I believe only Ivip or LISP or the like could solve the IPv4
routing scaling problem, and since I believe each would offer the
best possible means of maximising IPv4 address utilization, I don't
think anything further needs to be done regarding this problem.

Indeed, I think that the IPv4 address shortage will probably drive
the adoption of Ivip or LISP for non-mobile networks wanting PMHTE
benefits who don't need large amounts of space and/or which don't
want to get PI space, advertise it in the DFZ etc.


Mobility
--------

While no-one has data about the future, I consider these trends:

   Ubiquitous adoption of digital mobile telephony in developed
   and developing nations.

   Miniaturization of computers into hand-held devices for many
   purposes, including playing sound and video, for telephony,
   instant messaging (SMS texts and Internet IM protocols),
   web-browsing and other Internet applications, micro-payment
   systems, GPS capabilities, Bluetooth connections to other
   devices etc. etc.

   Ubiquitous adoption of Internet communications in developed
   and developing nations.

to be analogous to several freight trains approaching the one
junction at full-throttle.

One doesn't need a crystal ball ("data about the future") to be sure
that billions of end-users will soon want Internet access on their
hand-held devices (phones, pads, netbooks, laptops or whatever) AND
that they will want the effective identity of their computer, and its
ongoing communication sessions, to persist despite the device
automatically and frequently connecting to the Internet by a wide
variety of physical link technologies and through various access
networks.  Many of these connections will be completely ad-hoc - such
as a 3G network via roaming arrangements in a foreign country, or
using a WiFi service automatically, or with very little user
interaction, simply by being in range of a free service in a public
place.

I consider the upper limit for the number of these devices to be
approximately 10 billion - one (or sometimes two or more) for pretty
much every person on Earth, plus devices for selected pets, cars,
point of sale terminals, vending machines, trucks in vehicle tracking
systems etc.

Many of these devices will connect to IPv4 behind NAT, which
precludes the use of conventional mobile IP techniques.  NAT will
hopefully not be used for IPv6.

LISP proposes a form of mobility by which the MN (Mobile Node) acts
as its own ETR.  But there are many problems with this, not least its
inability to work behind NAT and the need for each MN to have an IPv4
address of EID space and its own IPv4 address of RLOC space.  See the
end of:

  http://www.firstpr.com.au/ip/ivip/lisp-links/#critiques


As far as I know, the only way of adequately solving the Mobility
problem - at least as I understand it, for IPv4 and IPv6, with hosts
accessing many networks on an ad-hoc basis, including behind one or
more layers of NAT - is the TTR Mobility architecture:

  http://www.firstpr.com.au/ip/ivip/TTR-Mobility.pdf

This is an extension to Ivip or perhaps to LISP.  Mapping changes are
not required for each new host address.  Mapping changes would be
infrequent for most MNs, since they are only needed (and even then,
not absolutely required) when the MN moves more than 1000km or so.

TTR Mobility doesn't require any new elaboration of the basic Ivip
architecture - though it would require many more micronets - up to 10
billion or so, rather than the 10 million or so required without
mobility.

Because of this, I don't see Mobility as being an extra burden on the
enhancement - as long as the enhancement is a CES architecture such
as Ivip or perhaps LISP.

CEE architectures theoretically support mobility, but not for IPv4,
not with the MN behind NAT - and at the cost of the MN having to do
much more routing and addressing work than the relatively simple
extra responsibilities it has in the TTR Mobility architecture.


Solutions
=========

I plan to write a separate message about this, but the short version is:

  Many RRG "proposals" are not complete proposals for scalable
  routing and can be ignored.

  Three proposals are potential solutions, and are not CEE or CES
  architectures: "Aggregation with Increasing Scopes" (AIS -
  Evolution), hIPv4 and Name Overlay (NOL).  I believe these are not
  practical solutions.

  There are four CEE proposals:

    GLI-Split
    ILNP
    Name-Based Sockets
    RANGI

  These are only applicable to IPv6, and they all require host stack
  changes.  All but GLI-Split require upgraded applications.  All of
  them only provide substantial benefits to adopters and substantial
  routing scaling benefits after essentially all hosts (and therefore
  all applications) adopt the new system.  These are extraordinarily
  high adoption barriers which preclude them from being seriously
  considered, since there are CES architectures which could solve
  the scaling problems and provide mobility.  Even if there were no
  such barriers, I would oppose them because I believe their
  Locator / Identifier Separation naming model would decrease the
  speed of session establishment and unreasonably burden all hosts
  with extra traffic and responsibilities.  That said, I am not
  yet convinced any of these proposals would work as well as their
  proponents intend.  See (msg05865).

  This leaves four CES architectures:

    LISP (currently LISP-ALT, but perhaps in the future
          with another mapping system)

    Ivip (I will soon describe a Distributed Real Time Mapping
          System which will be suitable for Ivip and LISP.)

    TIDR

    IRON-RANGER

  TIDR doesn't solve the scaling problem, because the traffic
  handling and mapping distribution is done by DFZ routers.

  I think IRON-RANGER needs a lot of work before the scaling
  and security problems inherent in its continual process of
  EID prefix registration with VP routers can be understood
  and shown to be resolved.  I think it also lacks technical
  and business arrangements for supporting packets sent from
  non-upgraded networks.

  That leaves LISP and Ivip.  For reasons I won't repeat here,
  I believe Ivip would be a good solution to the problems
  discussed here and that LISP in anything like its current form
  would be a poor solution.  FWIW, if APT were still under
  development, I would consider it better than LISP, since
  like Ivip, it uses local full-database query servers to
  avoid delaying or dropping initial packets.


Other work
==========

Please see the RADIR Problem Statement:

  http://tools.ietf.org/html/draft-narten-radir-problem-statement-05

and my comments on it, which I will post after this message.  That
message also contains pointers to the RRG Goals ID.