[rrg] AIS (Evolution) - discussion/critique

Robin Whittle <rw@firstpr.com.au> Mon, 22 February 2010 16:57 UTC
Message-ID: <4B82B7EB.6090802@firstpr.com.au>
Date: Tue, 23 Feb 2010 03:59:23 +1100
From: Robin Whittle <rw@firstpr.com.au>
Organization: First Principles
To: RRG <rrg@irtf.org>
Subject: [rrg] AIS (Evolution) - discussion/critique
Short version:   I think some of the AIS techniques may be useful in
                 principle to some DFZ-using ASNs (ISPs and large
                 PI-using end-user networks).  Whether they would be
                 on-balance useful, considering their costs and
                 complexity, to many ASNs, I am not sure.

                 I think the AIS techniques only provide marginal
                 improvements in the ability to run a network with
                 routers which can't handle the full DFZ, either in
                 their RIBs or FIBs.  We need to support multiple
                 orders of magnitude more end-user networks - and
                 AIS is supposed to cope with them all using PI
                 prefixes.

                 I believe the only way AIS could be considered a
                 potential solution to the routing scaling problem is
                 if it was a better documented standalone system, with
                 convincing arguments for why this would be superior
                 to the best of the CEE and CES architectures.

                 I am completely opposed to the suggestion within
                 this proposal that CEE be adopted in the long-term.


Hi Lixia, Lan and colleagues,

Thanks for your replies to my questions (msg05862, msg05991 &
msg06055).  I have now fully read your proposal and partially read
some of the supporting documents.

In addition to commenting on the several techniques you propose, I
disagree with what I think is an inaccurate and overly bleak
assessment you make about the attractions and benefits of Core-Edge
Separation architectures in general.  I think what you wrote about
LISP is entirely incorrect.

  - Robin


Main documents
--------------

  Summary:

      http://tools.ietf.org/html/draft-irtf-rrg-recommendation-04#section-14.1

  ID:

      http://tools.ietf.org/html/draft-zhang-evolution-02

  Critique - compiled by two of the proponents, Beichuan Zhang and
  Lan Wang (msg05693 2010-01-18) - "a summary of comments our team
  has collected, mainly from past RRG meetings and emails".

      http://tools.ietf.org/html/draft-irtf-rrg-recommendation-04#section-14.2



Supporting documents
--------------------

  FIB Aggregation
  http://tools.ietf.org/html/draft-zhang-fibaggregation-02

  FIB Aggregation with Virtual Aggregation
  http://tools.ietf.org/html/draft-ietf-grow-va-01

     (This seems to be a replacement for the ID cited
      in evolution-02: draft-francis-intra-va-01 .)

  Performance of Virtual Aggregation
  http://tools.ietf.org/html/draft-ietf-grow-va-perf-00

  Virtual Aggregation (VA)
  http://tools.ietf.org/agenda/76/slides/grow-5.pdf

    (Mentioned in the Summary but not in evolution-02.)

  Answers to questions:
  http://www.ietf.org/mail-archive/web/rrg/current/msg05991.html
  http://www.ietf.org/mail-archive/web/rrg/current/msg06055.html


For a discussion of the distinction between Core-Edge Separation
(CES) and Core-Edge Elimination (CEE) architectures, please see:

  CES & CEE are completely different (graphs)
  http://www.ietf.org/mail-archive/web/rrg/current/msg05865.html

  CES & CEE: GLI-Split; GSE, Six/One Router; 2008 sep./elim. paper
  (v2)
  http://www.ietf.org/mail-archive/web/rrg/current/msg06089.html

"EUN" means End-User Network.



Discussion/critique
===================


Overly negative assessment of CES architectures
-----------------------------------------------

From the Summary:

   Most proposals take a revolutionary approach that expects the
   entire Internet to eventually move to some new design whose
   main benefits would not materialize until the vast majority of
   the system has been upgraded; their incremental deployment
   plan simply ensures interoperation between upgraded and legacy
   parts of the system.

I think this is true of CEE proposals, but it is not an accurate
portrayal of CES architectures.  Please see (msg05865), noted above,
for a full explanation.

Although not directly related to this AIS-Evolution proposal, Lixia
recently wrote, in a critique of LISP (msg06032):

   One would not be able to see global routing table size
   reduction unless/until LISP has been adopted by significant
   number of networks.

This is not as harsh a criticism of LISP as the paragraph above, but
it still implies some kind of lack of proportionality between
adoption of this CES architecture and routing scaling benefits.  See
(msg06035) for Dino's objections to this and (msg06037) for my
objections.

From draft-zhang-evolution-02 (Section 4):

   We believe that one fundamental difference is that all new
   designs have an implicit assumption that the entire system
   would eventually move to the new design.

This is not the case for Ivip, or I think for LISP.  The more they
are adopted, the better for routing scalability.  There's no
threshold level of adoption before which there are no routing scaling
benefits.  The scaling benefits are closely proportional to levels of
adoption.  Also, the benefit to the EUN adopting the Ivip- or
LISP-managed SPI (Ivip) or EID (LISP) space is not dependent on the
number of other networks which adopt it - provided there are a good
set of DITRs (Ivip) or PTRs (LISP) supporting the adopted "edge"
address space.

(This is not the case with CEE - substantial benefits to adopters and
to scalable routing only occur after ~100% adoption by all networks,
not just those networks which want portability, multihoming and/or
inbound TE.)
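
As a rough illustration of the difference, here is a toy model in
Python.  It is only a sketch: the function names, the starting figure
of 300,000 PI prefixes and the step threshold used for CEE are
assumptions made for illustration, not figures from any proposal.

```python
def dfz_prefixes_ces(pi_prefixes, adoption):
    """Toy CES model: every EUN which moves from PI space to 'edge'
    space removes its prefix from the DFZ, so the benefit is
    proportional to the adoption fraction (0.0 to 1.0)."""
    return round(pi_prefixes * (1.0 - adoption))


def dfz_prefixes_cee(pi_prefixes, adoption, threshold=0.99):
    """Toy CEE model: no reduction until nearly all hosts have
    adopted.  The 0.99 threshold is a hypothetical stand-in for
    '~100% adoption'."""
    return pi_prefixes if adoption < threshold else 0


if __name__ == "__main__":
    start = 300_000  # hypothetical number of PI prefixes in the DFZ
    for adoption in (0.0, 0.25, 0.5, 1.0):
        print(adoption,
              dfz_prefixes_ces(start, adoption),
              dfz_prefixes_cee(start, adoption))
```

In this model the CES curve falls steadily with adoption, while the
CEE curve stays flat until the threshold is crossed.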

But this doesn't mean the CES architectures have to be "fully adopted"
to solve the routing scalability problem.  The DFZ today tolerates a
certain level of unscalable routing.  Solving the routing scalability
problem doesn't necessarily mean eliminating all PI prefixes.

As long as we can limit the growth to some tolerable level, or better
still, reduce the number of PI prefixes AND at the same time provide
much larger numbers of EUNs with the portability, multihoming and
inbound TE they need (ideally, mobility too) then for all practical
purposes, the routing scaling problem will be solved.

   No matter how much effort the designer puts into the
   incremental deployment step of a new design, the design
   itself does not start with the assumption that significant
   portions of the system would never adopt it.  Therefore, it
   is likely the case that the assumed benefit of the new design
   would be achieved only after a majority, if not the whole,
   of the system has deployed the design, and that the cost of
   incremental deployment would be minimized only then as well.

This is not true of LISP or Ivip - or probably of TIDR or
IRON-RANGER.  I think it was not true of APT either.

   The incremental deployment machinery is simply to glue
   together the part that has made the change and the rest
   that has not, to make the system function together at the
   intermediate, and hopefully transient, stage.  However
   the system as whole would be in a sub-optimal state until
   the new design gets fully deployed.  LISP can serve as an
   example here.

These statements are completely false in respect of LISP and Ivip at
least.  (I don't speak for the LISP team, but as a prominent critic
of LISP, if I defend LISP in some way, this should not be brushed off
lightly.)

   Like many others, we too hoped that our new design, APT, could be
   eventually deployed everywhere to put the routing scalability
   under control.

Perhaps you imagined that APT would not contribute to solving
the routing scaling problem until it was very widely adopted.  I
disagree - it would help with routing scaling to the extent it was
adopted, provided the adopting ISPs were connected as a single cloud
(or at least as a very few clouds).

It is incorrect to imply that I, as the designer of Ivip, think it
needs to be "deployed everywhere" to control the routing scaling
problem.  I agree it would be best if the great majority of EUNs
which want portability, multihoming and inbound TE adopted Ivip -
because those which don't will either not get these benefits, or will
get them using unscalable PI space.  This is not "deployed
everywhere".  I think the LISP designers also recognise the need to
have LISP adopted by most EUNs, but that is not the same as
"everywhere" or 100% of EUNs.

   We gradually realized that it is infeasible to attempt to
   roll out a new routing framework (i.e. a clean separation of edge
   prefixes from the core routing system) in a vast deployed system.

I think by "clean separation" you may mean, at least in some
respects, what I now refer to as "Core-Edge Network Isolation".
Please see (msg06089) - but such isolation is not required by APT to
achieve all its routing scalability goals.

Preventing hosts (all of which must be on "edge" addresses, not just
those in networks which need portability, multihoming and inbound TE)
from sending packets to "core" addresses (DFZ and ISP routers, ITRs
and ETRs) is an explicit non-goal of Ivip.  LISP has no such goal
either.  In Ivip, sending hosts on "edge" SPI addresses or on
ordinary ("core") addresses can optionally have inbuilt ITR
functions, so they need to be able to send packets to ETRs - which
are on "core" addresses.

This statement is from the "critique":

   Compared to others, this proposal has the lowest hurdle to
   deployment, because it does not require all networks move
   to use a global mapping system or to upgrade all hosts,
   and it is designed for each individual network to get
   immediate benefits after its own deployment.

This implies that for LISP and Ivip to be deployed to a level which
improves routing scaling (which is any level at all - it's just that
they need to be pretty widely deployed to reduce the routing scaling
problem to an acceptable level) that "all networks" need to use a
global mapping system.  This is not true - only those networks
adopting the system need to use the mapping system, because Ivip's
DITRs and LISP's PTRs will handle packets sent by hosts in
non-upgraded networks.  (The reference to upgrading hosts does not
apply to CES architectures such as LISP or Ivip - it applies to CEE
architectures.)

This also implies that other proposals including Ivip and LISP do not
enable each network to gain immediate benefits after its own
deployment.  This is a misleading portrayal of Ivip and LISP.  An EUN
which wants portability, multihoming etc. gets benefits by adopting
Ivip or LISP, because it can get these without needing PI space, or
much space at all.  An ISP adopting Ivip or LISP to the extent that
its customers can adopt it will get benefits simply because it is
helping its customers get what they want and need - even if there are
no other benefits to the ISP.


From one of the answers to questions (msg05991):

   A major consideration for our work is incremental
   deployability and immediate scalability benefit to anyone
   who adopts our scheme. You may recall that we started with
   a map-and-encap scheme APT, which falls into the core-edge
   separation category. However, during the past couple of
   years working  on APT, we realized that it is difficult for
   a single entity to deploy APT and receive immediate benefit.

A single ISP deploying APT enables the EUNs which adopt APT's "edge"
space to use it via any ETR in the ISP.  This is no different from
what they could do already, with PI or PA space.  However, if
multiple ISPs adopted APT and formed an APT island, then the EUN
could use that space, in small sections (smaller than /24 - right
down to /32) at any ETR of any of the ISPs.  This would also enable
the EUN to multihome with any two or more of these ISPs.

If there are multiple APT islands, then due to the /24 restriction in
the DFZ, it is impossible to use "edge" space from a single /24
partly in one island and partly in the other.  The solution seems to
be to have all such "islands" become one by sharing their mapping via
tunnels.
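
The /24 restriction can be sketched with Python's standard ipaddress
module.  This is only an illustration: the constant and function names
are hypothetical, 192.0.2.0/24 is a documentation prefix, and the /24
limit is the common IPv4 filtering convention rather than a protocol
rule.

```python
import ipaddress

# Assumption: DFZ routers commonly filter IPv4 prefixes longer than /24.
DFZ_MAX_PREFIX_LEN = 24


def advertisable_in_dfz(prefix):
    """Return True if a prefix is short enough to pass the /24 filter."""
    return ipaddress.ip_network(prefix).prefixlen <= DFZ_MAX_PREFIX_LEN


# One /24 of "edge" space which an EUN would like to split across two
# separate islands.
block = ipaddress.ip_network("192.0.2.0/24")
island_a, island_b = block.subnets(prefixlen_diff=1)  # the two /25 halves

if __name__ == "__main__":
    for net in (block, island_a, island_b):
        print(net, advertisable_in_dfz(str(net)))
```

Neither /25 half can be advertised on its own, so packets from outside
both islands can only be attracted by one advertisement covering the
whole /24.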

I think this may be a valid criticism of APT, Ivip and LISP when
applied to the question of what does a single ISP gain from being the
first adopter.

But if multiple ISPs see the benefits of the CES architecture and
then go ahead and adopt it for those reasons - then the question of
benefits for the very first adopter does not apply.  I guess the first
person to own a telephone didn't find it very useful until a second
person got one.

But this does not mean that all CES proposals suffer from the same
problem with adoption in a general sense.  I think the most likely
impetus for Ivip adoption is not from a single ISP, such as in a
given city or country, but for an organisation which wants to provide
global mobility, for IPv4 and/or IPv6, using the TTR Mobility
architecture:

   http://www.firstpr.com.au/ip/ivip/TTR-Mobility.pdf

Such an organisation may well be an ISP, but it need not be.

For TTR mobility, there need be no ITRs or ETRs in any ISP network.
What is needed is a bunch of TTRs around the world, with a common
mapping system driving the ITRs.  Then, with appropriate software in
the mobile devices (now mass-market iPhones, Google phones etc.) the
mobile devices tunnel to nearby TTRs and get their own globally
portable IPv4 address (or IPv6 /64) which they can use from any
access network, including behind one or more layers of NAT.  Mapping
changes are not required for every change in access network.

I think it is OK to drop development of APT if you can't think of why
anyone would be an early adopter.  However I do not believe you are
entitled to extrapolate this argument against APT to all other CES
architectures, without detailed analysis of them - which you have not
done.

I will soon be writing up an even more decentralised approach to Ivip
mapping than the one I mentioned in (msg05975).  This will involve
the companies which lease out SPI space from their Mapped Address
Blocks (MABs) doing more of the work, and the adopting EUNs running
their own ETRs, without any need for support from their ISPs.  This
new approach will also be fine for TTR mobility.  Eventually, as this
became widespread, ISPs would want to have their own ITRs, to stop
packets leaving their networks, going to the nearest DITR only to be
returned again in tunnels to the ETRs in the ISP's network which
connect to the destination networks.

The AIS-Evolution approach is inherently incremental and intended for
any ISP or any PI-using EUNs with one or more DFZ routers.  It
doesn't depend much or perhaps at all on how many other networks
adopt it.  These are advantages, but I don't think you should portray
all other approaches as being as completely flawed regarding benefits
to early adopters - unless you examine them all and provide detailed
arguments why this is the case.  It is the case with all CEE
architectures - but you haven't provided detailed critiques of LISP
or Ivip.

Perhaps a trap in these discussions is to think that the only benefit
an ISP would get by adopting a CES architecture would be the
reduction in the number of prefixes in the RIBs and FIBs of its DFZ
routers.  These will only be reduced, or their growth held in check,
to the extent that the CES architecture is widely enough adopted to
tempt PI-using EUNs, or any other EUNs which would have adopted PI
space, to use CES-managed "edge" space instead.

If this was the only measure of benefit for an ISP, then I agree, it
would take a long time (if ever) for any of the CES architectures to
"pay off" - since it would rely on many other ISPs adopting it as well.

However, assuming other ISPs are also "adopting the CES architecture"
- at least whatever is required for the ISP's paying EUN customers to
use the CES-architecture's SPI addresses via this ISP's network -
then there is a second set of benefits for each such ISP independent
of the number of prefixes in the DFZ.  This benefit is that the ISP's
customers are benefiting from their new SPI address space.  This is
fundamentally good for the ISP, because if their customers couldn't
adopt the CES "edge" space through this ISP, they would be inclined
to find another ISP.

However, I agree - if all that happened one day was a single ISP
adopted APT, LISP or Ivip, there does not appear to be any direct
benefit to the ISP or their customers.  This is not the only way CES
architectures can be adopted.  Adoption may not depend on ISPs being
first movers at all - at least for Ivip.

The concept of an ISP "adopting a CES architecture" is rather
simplistic.  Perhaps with APT this was an all-or-nothing affair -
since I think ETRs could only be in ISP networks which had fully
adopted APT with its mapping system connections, Default Mappers,
ITRs etc.   Perhaps this is the case with LISP too, but I doubt it.

It is certainly not the case with Ivip.  Any EUN can run their own
ETR on the PA space they get from any ISP.  So when an EUN adopts
Ivip - meaning it uses SPI space for its hosts - it doesn't need an
"Ivip-upgraded" ISP.  The ISP doesn't have to do anything to allow an
EUN to implement an ETR on the one or more PA addresses the ISP
provides.  Generally, these should be fixed IP addresses, or at least
stable from one week to the next.  There don't have to be ITRs in
the EUN, or in the ISP network, for the EUN to use SPI space via an
ETR function it runs itself.

I will write more on this soon.  For Ivip, at least, I am foreseeing
the adoption to be started by one or more companies setting up MABs
and widely dispersed DITRs to support these MABs - and then letting
EUNs run their own ETRs on any PA space they get from any ISP.  Also
there is the use of the same system for TTR Mobility - which is probably
a more financially attractive service to provide than portability,
multihoming and inbound TE for non-mobile networks.


Goals of AIS-Evolution
----------------------

Thanks to prompting from Iljitsch van Beijnum two or so years ago,
the Ivip-arch ID now has a full set of goals and non-goals.  I think
all scalable routing proposals would benefit from this.

I understand that AIS-Evolution does not aim to directly support
mobility in any way. (msg06055):

   mobility support is considered best supported separately from the
   DFZ routing system.

But if current approaches to mobility are adequate, then why aren't
they widely used in IPv4 or IPv6?  As far as I know, the only way of
doing mobility properly is for the host to have its own global
unicast address it keeps no matter where it is.  The only way I know
of doing this is the TTR Mobility approach.  If you think that
existing approaches to mobility are better than this, then I would
appreciate you doing a proper critique of TTR Mobility.

So AIS-Evolution is only aimed at routing scalability for non-mobile
networks.

I asked about whether the aim was for a short-term preliminary to a
longer-term CEE architecture:

   RW>>  suggests it is near-term preliminary to a longer-term
   RW>>  host-based solution - implicitly a Core-Edge Elimination
   RW>>  scheme.

   LW> Your interpretation is not exactly what we meant to say.
   LW>
   LW> We aim to solve the routing scalability problem both in the
   LW> near term and in the long term through increasing scope of
   LW> aggregation.

OK, with no mention yet of CEE, AIS-Evolution is supposed to solve
the routing scaling problem in the short-term and long-term -
implicitly indefinitely.

   LW> The following paragraph from our proposal says that our
   LW> solution is orthogonal to a host-based solution.

OK - "host-based solution" means CEE architecture.  If AIS-Evolution
can solve the routing scaling problem in the long term, then why
would a CEE architecture be needed?

   LW> Since a host-based solution will not solve the routing
   LW> scalability problem until a large portion of the Internet
   LW> hosts adopt it, ...

Until essentially all hosts adopt it, meaning not just the hosts in
networks which need portability, multihoming and inbound-TE, but
*all* hosts everywhere, in ISP networks and in the many customer
networks which are working fine from PA space.  CEE only works for
IPv6 and only provides significant scaling benefits or significant
benefits for adoptors after almost *all* hosts have adopted it fully.

For this to occur, essentially *all* applications need to be
rewritten to run on IPv6 and then on IPv6-CEE.  Yet they need to run
on IPv4 too - so this is essentially triple-stack, with applications
supposedly working fine using some mixture of the different kinds of
addressing and protocols.  This is an extraordinarily steep set of
barriers to overcome before anyone gets any benefits.  (GLI-Split can
use existing IPv6 applications - but I am yet to understand how it
would work well.)

CEE also means burdening all hosts with extra work to manage their
new routing and addressing responsibilities.  I think this would be a
bad thing for all hosts - delaying the establishment of
communications in general.  I think it would be even more of a burden
for mobile hosts.

If you think CEE is a better approach to routing scalability than a
CES architecture such as Ivip or LISP, I would appreciate you
explaining exactly why you believe this, including addressing the
problems of this extra work and delays in establishing communications:

   Today's "IP addr. = ID = Loc" naming model should be retained
   http://www.ietf.org/mail-archive/web/rrg/current/msg05864.html


   LW> ... we still need a short-term solution (in our proposal,
   LW> the first two steps FIB aggregation and Virtual Aggregation
   LW> will serve this purpose).  If a host-based solution never
   LW> gets widely deployed, the later steps in our proposal (e.g.
   LW> inter-AS VA) will address the long-term scalability needs.

OK, so I understand the goals are something like this:

    FIB Aggregation      }   Do this in the short term - they
    Virtual Aggregation  }   won't get in the way of CEE.

    Then either:

       1 - Wait for a CEE to be developed and deployed - which
           means all applications will be willingly updated by
           their developers - to the point where it looks like it
           will be ~100% adopted and therefore solve the routing
           scaling problem:

              No need for Inter-AS Virtual Aggregation

    or 2 - Decide that no CEE will ever be adopted and therefore
           deploy Inter-AS Virtual Aggregation.

So I understand you regard Inter-AS Virtual Aggregation as a
second-best approach compared to a successful CEE architecture.

But you have not provided a proper critique of CES architectures.
Your only critique so far has been how difficult they are to get
widely adopted, due to the apparent absence of benefits for the first
adopting ISP.  As noted above, I don't consider this to be a full
critique of all the ways a CES architecture could become widely
adopted.   You have not suggested that CES couldn't solve the routing
scalability problem.

Yet after rejecting CES based on a very narrow and incomplete
argument, you intend to ignore CES architectures, and look only to
CEE architectures for the real solution to the routing scaling
problem - with your own Inter-AS Virtual Aggregation as a second-best
alternative if there is no such CEE.  But CEE architectures are
vastly more difficult to get adopted than any CES architecture.
Furthermore, the CEE architectures involve generally degraded session
establishment times, due to the need to perform at least one and
typically two extra global mapping lookups.

Yet you haven't pointed out a particular CEE architecture as being
worthy of consideration as a proper scalable routing solution,
although you do mention ILNP (Summary):

   Note that our proposal neither interferes nor prevents
   any revolutionary host-based solutions such as ILNP from
   being rolled out.  However, host-based solutions do not
   bring useful impact until a large portion of hosts have
   been upgraded.  Thus even if a host-based solution is
   rolled out in the long run, an evolutionary solution is
   still needed for the near term.

Another goal is, implicitly, support for CEE (msg06055):

   Another missing item from the proposal is end identifier: We
   believe it is important to have solutions for end identifiers,
   even though AIS does not depend or make use of it.

But you don't present any arguments for why Locator / Identifier
Separation (CEE) is desirable.

From the above, I perceive:

    FIB Aggregation      }          Short term
    Virtual Aggregation  }

    Inter-AS Virtual Aggregation    Long term if no CEE solution
                                    looks feasible

These seem to correspond with points 1, 3 and 4 in the 6 point list
in the Summary.

Point 2 is "Topology-based mode Virtual Aggregation".  This seems to
involve tweaking the arrangements for RRs (Route Reflectors).

Point 5 involves reducing the load on DFZ routers by implementing the
RIB and eBGP functions on a commodity server. But how do these
"controllers" do their BGP work if they can't use the data ports of
the actual router?  I can't see where this is discussed in the ID.

The ID itself has five steps - apparently not including Point 5.

Overall I see the goals of AIS-Evolution (not counting your support
for CEE) as local attempts to get routers with limited FIB and RIB
capacities to cope with a greater total number of DFZ routes.

What I think is strikingly missing from the goals is the actual
reduction of the number of routes in the DFZ.  This is an explicit
goal of both CES and CEE architectures.

So I think AIS-Evolution is a defensive, local, complex set of
approaches which will enable some older routers to be used for longer
as the number of prefixes in the DFZ grows - at the cost of:

  1 - Greater complexity of management and interconnection.

  2 - More tunneling - and so more PMTUD difficulties, unless
      every tunnel is fully RFC 1191 / RFC 1981 compliant, by
      generating valid PTBs to the sending host for every PTB
      generated by a router in the tunnel.

  3 - Longer paths inside adopting networks as packets go through
      more complex arrangements of VA routers.

  4 - Reduced robustness compared to the same number of routers
      each being able to handle the full set of routes.

  5 - If VA tunnels are to the routers of other ASNs, greater
      problems with coordinating these than with today's simpler
      techniques.
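
To make the PMTUD cost in point 2 concrete, here is a minimal sketch
of the arithmetic a tunnel ingress has to do when a router inside the
tunnel returns a PTB.  The function name is hypothetical, and the
sketch assumes plain IP-in-IP encapsulation with no options or
extension headers (20 bytes of outer IPv4 header, 40 bytes of outer
IPv6 header).

```python
IPV4_OUTER_HEADER = 20  # bytes, IPv4 header without options
IPV6_OUTER_HEADER = 40  # bytes, fixed IPv6 header


def mtu_to_report(next_hop_mtu, outer_header_bytes):
    """MTU the tunnel ingress should report to the sending host in its
    own PTB, given the next-hop MTU reported by a router inside the
    tunnel.  The encapsulation overhead must be subtracted, since the
    host's packet has to fit inside the outer packet."""
    return next_hop_mtu - outer_header_bytes


if __name__ == "__main__":
    # One level of tunneling over a 1500 byte link.
    print(mtu_to_report(1500, IPV4_OUTER_HEADER))
    # A second tunnel nested inside the first shrinks it again.
    print(mtu_to_report(mtu_to_report(1500, IPV4_OUTER_HEADER),
                        IPV4_OUTER_HEADER))
```

Each extra level of tunneling reduces the usable packet size, and
every ingress has to perform this translation for PMTUD to keep
working end to end.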


3.1. Step One: Local FIB Size Reduction
---------------------------------------

I wonder to what extent the techniques proposed would actually
improve on the behaviour of commercial routers.  Some of them would,
I am sure, but at the cost of potentially creating routing loops or -
as you acknowledge in the critique, upsetting Reverse Path Forwarding
checks.

To what extent can these techniques be adapted to commercial routers?
Why haven't they been used already?

To what extent does the extra complexity increase the time taken to
alter the FIB?  This would depend a great deal on the techniques used
to implement the FIB.  TCAM can have very long worst-case update
times, even for a small update, since in order to make the change, it
is possible that large numbers of entries must be written into new
locations - since the order of the comparisons in terms of the
priority encoder is important.  While this is happening, the TCAM
can't be used to classify packets.
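
The worst case can be sketched as follows.  This assumes the simplest
possible layout: entries packed contiguously, longest prefix first,
with no free slots between them.  Real TCAM drivers use smarter
layouts which move far fewer entries, so this is an upper bound under
stated assumptions, not a description of any particular router.

```python
def tcam_insert(table, prefix_len):
    """Insert an entry into a naive ordered TCAM.

    'table' is a list of prefix lengths kept longest-first, so that
    the priority encoder (which picks the lowest matching index)
    returns the longest match.  Returns the number of existing
    entries which had to be rewritten one slot lower.
    """
    pos = 0
    # Skip past every entry at least as long as the new one.
    while pos < len(table) and table[pos] >= prefix_len:
        pos += 1
    moved = len(table) - pos  # everything below 'pos' shifts down
    table.insert(pos, prefix_len)
    return moved


if __name__ == "__main__":
    table = [24] * 1000           # a thousand /24 entries
    print(tcam_insert(table, 32))  # one new /32 goes above all of them
```

A single new long prefix can force every shorter entry to be rewritten
in this layout, which is why a small update can have a long worst-case
time.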

There are millions of end-user networks wanting portability,
multihoming and/or inbound TE.  Do you expect AIS to enable all ISPs
to cope with DFZ prefix numbers in the millions?  At best, I see all
your approaches, including FIB aggregation, as being of only marginal
benefit when we need multiple orders of magnitude improvement in our
ability to support end-user network prefixes being globally portable
and multihomable.

At least FIB aggregation is a process localised inside each router -
unless it causes undesirable paths for packets.
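
A minimal sketch of the conservative form of FIB aggregation, using
Python's standard ipaddress module: prefixes which share a next hop
and are adjacent are merged into their common covering prefix.  The
function name and the example FIB are hypothetical, and the sketch
assumes the FIB entries do not overlap one another.  It is not the
algorithm from the fibaggregation ID.

```python
import ipaddress
from collections import defaultdict


def aggregate_fib(fib):
    """Merge adjacent prefixes which share a next hop.

    'fib' maps prefix strings to next-hop names.  Assumes no entry is
    nested inside another; with nested entries this simple grouping
    could change longest-match results.
    """
    by_next_hop = defaultdict(list)
    for prefix, next_hop in fib.items():
        by_next_hop[next_hop].append(ipaddress.ip_network(prefix))

    aggregated = {}
    for next_hop, nets in by_next_hop.items():
        # collapse_addresses() merges sibling networks into their parent.
        for net in ipaddress.collapse_addresses(nets):
            aggregated[str(net)] = next_hop
    return aggregated


if __name__ == "__main__":
    fib = {
        "10.0.0.0/25": "A",
        "10.0.0.128/25": "A",
        "10.0.1.0/24": "B",
    }
    print(aggregate_fib(fib))
```

This form only merges address space which was already routed to the
same next hop, so forwarding is unchanged.  Any scheme which goes
further and covers unrouted space would forward packets which the
original FIB would have dropped.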


3.2. Step Two: Network-Coordinated FIB Size Reduction
-----------------------------------------------------

Although this and some other sections refer to external IDs, I think
these sections really need to be improved with diagrams and examples.

Only with examples, and extrapolation to handling millions of
prefixes, could the reader begin to imagine the problems of scaling,
management complexity, debugging and automation of router
configuration that this and other suggested arrangements would
involve.  I think that the case for these suggestions would be
greatly improved with more detailed discussion of how this would
scale to DFZ prefix numbers of 1 million, 5 million or whatever
number you expect these methods to be worth using with.


3.3. Step Three: Reducing Adjacent AS Virtual Aggregation Overhead
------------------------------------------------------------------

I haven't followed this in detail.  I understand the idea of
tunneling from a router in one ISP directly to a BR of another ISP,
where packets addressed to a particular prefix are being sent out to
some other network.

I wonder how many such tunnels will be needed, and what capacity
existing routers have to support such tunnels, with full PMTUD
support.  I understand the idea is to do this only for the prefixes
with most traffic, and let most prefixes use tunnels only within the
one AS.  But how is this sort of thing to be automatically and
dynamically managed, as prefixes change, their traffic patterns
change, and their egress routers change?  It seems that writing the
appropriate software for this and then debugging it would be quite
daunting.  Then, even if the software was functionally perfect, it
would be quite difficult to correctly configure and operate in a busy
network.

I am not clear how far this principle should be extended.  Should a
router in ISP-A tunnel directly only to a BR of a directly connected
ISP-B, or should it also tunnel to a BR of another ISP-C which
connects to ISP-B but not ISP-A?


3.4. Step Four: Reducing RIB Size
---------------------------------

I didn't clearly follow the previous step, but does it involve more
work for the route processor - more prefixes in the RIB?  These
sentences seem to indicate it does:

   When more networks have adopted Virtual Aggregation, the
   mapping table is likely to grow large, which may make it
   no longer feasible to piggyback all the mapping information
   on the existing BGP sessions.  The main problem, as we can
   perceive today, would be the RIB size growth: A BGP router
   will receive the same mapping information from multiple
   neighbouring BGP routers, and store all of it in its
   Adj-RIBs-IN.  Thus BGP routers may end up with storing
   multiple copies of the same mapping information.

I don't clearly follow the proposed solution - of moving some of the
work to a separate BGP instance.  But if this is on the same router,
does this help, since there is only so much CPU power and RAM
available for all such instances?

I couldn't understand "caching" in:

    Since APRs (or ingress routers, if they are upgraded to
    handle caching) ...

This paragraph (page 12) is clearly saying something crucial about
the impact of AIS-Evolution, as indicated by:

    The prefixes that got aggregated out of the core routing
    system would be those that belong to the edge ASes.

but I have no understanding of how the individual actions of ISPs in
adopting some or all of these AIS-Evolution measures would have the
effect of reducing the number of prefixes handled by DFZ routers of
ISPs which do not adopt these measures, which is what I understand by
the term "core routing system".


3.5. Step Five: Insulating the Core from Edge Churns
----------------------------------------------------

Here I think there is some attempt to reduce the load on the DFZ in
general - all DFZ routers including those of ISPs and larger
DFZ-involved PI-using end-user networks, irrespective of whether
these networks adopt some or all of the AIS-Evolution techniques.

Anything which reduces the number of updates will help reduce the
routing scaling problem - assuming that traffic flows are not
adversely affected.  But the following text makes me think you want
to do something, but haven't figured out what yet:

   Short failures, which are frequent, should not be propagated
   through the mapping system.  Instead, they should be handled by
   other means.  For example, in the APT design, the failure handling
   actions are data-driven, i.e., a link failure to an edge network
   is not reported unless and until there are data packets that are
   heading towards the failed link.  We are actively working on an
   evolutionary solution that can provide equivalent data-driven
   handling of edge failures as APT does.

But if, by "data-driven" you mean that sometimes you won't need to do
anything because no data was flowing anyway, then this does not seem
to be a realistic solution.  I run a mailserver and nameserver here
in my home-office, with a very low volume web-server.  Hardly a second
goes by without traffic.   Your "not reported unless and until there
are data packets that are heading towards the failed link" wouldn't
save much work for any network which was big enough to want
portability or multihoming.
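
As a rough illustration of this point, the sketch below estimates how
often a failure would go unreported because no packet arrived while
the link was down.  It assumes packets arrive as a Poisson process,
which is a modelling assumption of this sketch and not something the
AIS or APT documents state.  The rates and the 30 second outage are
hypothetical figures.

```python
import math


def prob_no_packets(rate_per_second, outage_seconds):
    """P(no packet arrives during the outage), assuming Poisson arrivals."""
    return math.exp(-rate_per_second * outage_seconds)


# Hypothetical inbound packet rates (packets per second) for a
# 30 second link failure.
for rate in (0.01, 0.1, 1.0):
    print(rate, prob_no_packets(rate, 30))
```

Under these assumptions, a network receiving even one packet per
second would almost never see a failure pass without traffic towards
the failed link, so the data-driven approach would rarely avoid a
report.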

It is not at all clear to me how the changes you propose can provide
multihoming service restoration in a more scalable way than with PI
space today.  To contribute to the solution of the routing scaling
problem, this needs to be done with much less burden on the DFZ
control plane than at present.

I asked about multihoming service restoration - but the answers
(msg06055) don't give me any idea of how this can be achieved with
the new techniques in a way which reduces the burden on the DFZ as a
whole.

RW:    2 - How does AIS support multihoming?  The word does not
           appear in the proposal documents or the summary.

           How does the multihoming support detect failure of the
           link between one ISP and the end-user network while
           detecting that the link from a second ISP is working, and
           that the second ISP itself is reachable?

           How is this information relayed to or discovered by, the
           routers (APRs?) which tunnel traffic packets to "egress
           routers"?

LZ:   the above are good questions that are not explicitly addressed
      in the 500 word solution summary. Essentially, AIS assumes that
      BGP works as today, thus edge sites following their existing
      multihoming practices. Transit ASes perform internal FIB/RIB
      aggregation to maintain FIB/RIB size at desired level, without
      impacting multihoming/ TE practices.

      Since the network would operate as usual, failure detections
      between edge sites and their ISPs are handled by routing
      protocols as today. the mapping information between aggregated
      prefixes (virtual prefix) and the specific ones comes directly
      from BGP routing updates.


Tunneling and its impact on Path MTU Discovery
----------------------------------------------

This answer from (msg06055) doesn't answer my question:

LZ:  AIS does not specifically address the PMTU problem. The above
     mentioned solutions can work if they get deployed.

But these "solutions" are specific to other frameworks.  Perhaps Fred
Templin's SEAL could be used - but no current router supports it as
far as I know.  My IPTM technique is specifically for Ivip and is not
intended for the sort of tunneling you envisage with AIS-Evolution.

If you use tunnels, I think they must return valid PTBs to the
sending host if they get a PTB from the tunnel.  For IPv4, this means
the ingress tunnel router needs to cache a few dozen bytes at least
of the initial packet, since it can't rely on getting enough of the
offending packet from the router in the tunnel which generates the
PTB to be able to create the valid PTB to the sending host.
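
A minimal sketch of this caching follows, assuming plain IPv4-in-IPv4
encapsulation (20 bytes of overhead) and assuming a PTB from inside
the tunnel can be matched to the original packet through the outer IP
Identification field.  The class, method names, cache size and the
dictionary standing in for the PTB are all hypothetical.  This is not
how AIS, SEAL or IPTM specify it.

```python
from collections import OrderedDict

ENCAP_OVERHEAD = 20   # assumed: one extra 20-byte IPv4 header
CACHE_BYTES = 28      # inner IPv4 header (20) + first 8 payload bytes
CACHE_LIMIT = 1024    # hypothetical bound on the number of cached packets


class IngressTunnelRouter:
    def __init__(self):
        # outer IP ID -> (sending host address, start of inner packet)
        self._cache = OrderedDict()

    def encapsulate(self, outer_id, sender, inner_packet):
        """Remember the start of the inner packet before tunneling it."""
        self._cache[outer_id] = (sender, inner_packet[:CACHE_BYTES])
        self._cache.move_to_end(outer_id)
        if len(self._cache) > CACHE_LIMIT:
            self._cache.popitem(last=False)  # evict the oldest entry

    def on_tunnel_ptb(self, outer_id, tunnel_next_hop_mtu):
        """Build the fields of a PTB for the sending host, or None."""
        entry = self._cache.get(outer_id)
        if entry is None:
            return None
        sender, inner_start = entry
        return {
            "to": sender,
            # The host must leave room for the encapsulation header.
            "next_hop_mtu": tunnel_next_hop_mtu - ENCAP_OVERHEAD,
            "quoted": inner_start,
        }


itr = IngressTunnelRouter()
itr.encapsulate(outer_id=7, sender="192.0.2.10", inner_packet=bytes(1500))
ptb = itr.on_tunnel_ptb(outer_id=7, tunnel_next_hop_mtu=1480)
print(ptb["to"], ptb["next_hop_mtu"], len(ptb["quoted"]))  # 192.0.2.10 1460 28
```

A real router would also need to key on the outer destination address
and to age entries out by time.  Both are omitted here for brevity.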

LZ:  I'd also like to make a personal observation here: I heard that
     MPLS ran into PMTU problem in its early days, it eventually went
     away by reducing default MTU size; the same approach has also
     been used to avoid PMTU problem in Apple's MobileMe
     implementation (which applied end-end encapsulation between
     hosts)

If this approach is taken with any widely adopted scalable routing
solution then we will be locked into ~1500 byte packets until the
system is entirely replaced.  Until then, we would never be able to
send ~9k byte jumboframes across the DFZ.


PI prefixes are aggregated, not removed as such
-----------------------------------------------

RW:    5 - I understand that some or ultimately all ISPs would run
           APRs.  I assume that when all ISPs run APRs that this will
           enable the removal of end-user network prefixes from the
           DFZ.

LZ:    All ISPs can run APRs to reduce their routing table size.
       None is required to run APRs to reduce other ASes' routing
       table size.  One AS runs APRs if and only if it wants to
       reduce its own table size.

OK - this confirms what I thought, that there is no reduction in
number of prefixes for DFZ routers in ASes not using these techniques.

RW:        When only some ISPs run them, is there any prospect for
           reducing the number of prefixes advertised in the DFZ?
           If so then how would end-user networks whose prefixes
           were removed receive packets sent by hosts using ISPs
           without APRs?

LZ:   the #prefixes can be reduced *between* neighbour ASes doing AIS.
      No end user prefixes get removed per se; they get aggregated.

I think I can understand VA within an AS as reducing the need for at
least some routers to have the full set of DFZ prefixes in their FIBs.
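
My understanding of VA within a single AS can be sketched as follows.
An ordinary router carries only a route for the virtual prefix, which
points at an APR, and only the APR carries the specific prefixes.
The prefixes, next-hop names and function names are hypothetical, and
the sketch ignores how the tunnels themselves are established.

```python
import ipaddress


def longest_match(fib, destination):
    """Return the next hop of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hop in fib.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None


# The APR holds every specific prefix inside the virtual prefix.
apr_fib = {
    "198.51.100.0/25": "egress-1",
    "198.51.100.128/25": "egress-2",
}
# An ordinary router holds only the virtual prefix, pointing at the APR.
router_fib = {"198.51.100.0/24": "tunnel-to-apr"}


def forward(destination):
    """Resolve a destination via the router and, if needed, the APR."""
    hop = longest_match(router_fib, destination)
    if hop == "tunnel-to-apr":
        hop = longest_match(apr_fib, destination)
    return hop


print(forward("198.51.100.200"))  # egress-2
```

The ordinary router's FIB has one entry where the APR's has two, at
the cost of an extra hop through the APR.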

I think I can imagine this extending in some way by routers in one AS
tunneling packets to routers in a neighbouring AS.

But, without diagrams and much more extensive explanations I can't
yet see how either or both of these could occur:

    1 - Two neighbouring ASes are somehow able to handle multiple
        PI prefixes of end-user networks somewhere else in the world
        without any routers in their systems needing to have separate
        RIB and/or FIB routes for each prefix.

    2 - How this could be extended not just to immediate neighbours,
        but to neighbours' neighbours - and so potentially to all
        ASNs, in a robust, scalable fashion.


Fully-deployed AIS-Evolution?
-----------------------------

RW:     6 - What advantage would ILNP or any other CEE architecture
            provide compared to a fully deployed version of AIS?

LZ:     "a fully deployed version of AIS" does not seem a well formed
         statement, given AIS is a solution for parties who need it.

Well, everyone with a DFZ router will need it if ten million end-user
networks all advertise their own PI prefixes!

I meant the situation in which all your suggested techniques were
fully adopted by pretty much every AS which is affected by the routing
scaling problem - pretty much every organisation with a DFZ router.

LZ:      Although a uniform universe would make a simpler picture, it
         seems that the actual deployment of Internet has already led
         to different practices. Personal view: such heterogeneity in
         IP operations is likely going up over time.

         Back to your question of what advantages a full ILNP or any
         other CEE deployment would bring: I am not sure about
         advantages or not advantages, what I can see as differences
         are

           1 - major changes in operations in edge networks, to
               handle  multiple provider-assigned prefixes

           2 - increased dependencies of the core on all edge
               networks doing the right thing.

OK - I understand these two points (which I numbered) as being
difficulties with a CEE architecture:

  1 - All edge networks (not just those wanting portability and
      multihoming - all networks with hosts, including ISPs'
      hosts) need to change the stacks of all their hosts and,
      except perhaps for GLI-Split, have all their applications
      rewritten.  Also, drop IPv4 and get IPv6 services for all
      networks.

  2 - I don't understand this.  With CEE, the edge networks only get
      PA prefixes from their ISPs, so there's nothing they can do
      to destabilise the DFZ routing system.  Assuming the ISPs only
      change their advertisements for reconfiguring their networks
      and for rare outages, then the DFZ will be fine - with only
      ISP prefixes.

My question was why would you suggest a CEE architecture be used at
all if AIS can really solve the routing scaling problem, however defined?


RW:       7 - Do you consider a full AIS deployment to be a Core-Edge
              Separation architecture, in that it creates a subset of
              the global unicast address space?

LZ:       see my comment to the prev question.

          in the first evolution draft we envisioned that, if all
          networks deployed virtual aggregation, then the world would
          converge towards (do not know whether it would ever reach)
          the direction of a core-edge separation.

          I do not know what you meant by "a subset of global unicast
          space"...

I meant to write "creates an 'edge' subset of the global unicast
address range which is managed differently, with ITRs, mapping system
and ETRs etc. so these addresses can be used by end-user networks for
portability, multihoming and inbound TE with little or no impact on
the DFZ control plane."

You were one of the authors of the 2008 paper which established the
terms "Core-Edge Separation" and "Core-Edge Elimination".  The
defining characteristic of CES architectures such as APT, LISP, Ivip
and others is that they create a new subset of the global unicast
address space as just described.  I would appreciate you responding
to my attempt to sort out these conceptual and terminological
difficulties with "separation" ("isolation of edge hosts from sending
packets to core addresses"?), CEE and CES:

   CES & CEE: GLI-Split; GSE, Six/One Router; 2008 sep./elim. paper
   (v2)
   http://www.ietf.org/mail-archive/web/rrg/current/msg06089.html


RW:     8 - Why would AIS, or AIS supplemented by a CEE architecture,
            be better than LISP or Ivip?  (I do not consider TIDR to
            be a solution because it uses DFZ routers for mapping
            distribution.  I don't yet understand RANGER, but perhaps
            it too does this.)

             (Note 2010-02-23:  RANGER uses separate routers, it does
              not use the DFZ for mapping distribution.)

LZ:     regarding LISP: (1)please see the critique I sent out couple
        days back, where I mentioned a basic question of whether one
        could draw a universal division line between "the core" and
        all edges;

LISP doesn't attempt to do this.

LZ:     one of the reasons we gave up APT is because APT also had
        this model of divided world view.

As best I understand it, you had a vision of "isolation" as an
end-point of full APT adoption - but this is not required for
achieving all its routing scaling benefits, as far as I know.

LISP and Ivip have no such goal of "isolation", so this is not a
critique of these CES architectures.

LISP and Ivip do distinguish clearly between two classes of network:

   1 - ISPs - who sell connectivity to other networks, including
       other ISPs.

   2 - End-User Networks - who buy and use connectivity and do not
       sell it to anyone.

LISP and Ivip provide a special subset of the global unicast address
space (EID for LISP, SPI for Ivip) which has special destination
Locator semantics for ITRs only, but which is otherwise handled
normally by all other routers, and by all hosts.   (msg06070) and
(msg06082).

This "edge" subset of space can be used by End-User Networks which
want portability, multihoming and inbound TE, with no direct burden
on the DFZ control plane.  Only the MABs (Ivip) or "coarse" prefixes
(LISP) which enclose the "edge" space of tens of thousands of
end-user-networks are advertised in the DFZ - and these
advertisements do not change frequently.  (Ivip's DITRs and LISP's
PTRs advertise them.)

With LISP and Ivip, end-user networks no longer need to use PI space
if they want portability, multihoming and inbound TE.  Ideally many,
most or all current PI users will convert their space to "edge"
space, or hand it back and adopt other "edge" addresses.  But even if
they don't hand it back, the routing scaling problem will be largely
solved because millions of end-user networks currently lacking
portability etc. will be able to get it scalably with the new "edge"
space.

I can't see how distinguishing conceptually between two types of
network is a problem for LISP or Ivip - if this is what you mean by
"draw a universal division line between 'the core' and all edges".
If, by this, you mean "isolation" (see msg06089), then this is
definitely not part of LISP or Ivip - and nor was it needed for APT
to be a good scalable routing solution.


LZ:     (2) the issue of alignment (or lack of it) between cost and
        gains: if one ISP rolls out LISP, how much could it reduce
        its routing table?

I discussed this above - a single adopting ISP doesn't derive any
benefit.  But this is not the only way a CES will be widely adopted.

LZ:     The above two considerations apply to Ivip as well, plus a
        3rd one for Ivip: as I wrote in my Ivip critique, I question
        the validity of the model of globally synchronized mapping
        database.

OK - I really appreciate you writing a critique for Ivip.  However,
neither you nor anyone else has written about what their concerns
were about Ivip's specific proposal for mapping distribution:

  http://tools.ietf.org/html/draft-whittle-ivip-fpr-00

Likewise I am keen to get some detailed critiques of TTR Mobility:

  http://www.firstpr.com.au/ip/ivip/TTR-Mobility.pdf


Conclusion
----------

I have worked hard to understand AIS in general, though I haven't
tried to read all the other IDs which are referenced.

My impression is that you think most of the steps are suitable for
marginal improvements, considering the great demands of scalable
routing - and that you think a CEE approach is the proper long-term
solution.

However, you seem to have rejected CES architectures purely on the
basis of your rejection of APT - and I think that rejection was
largely or entirely based on concerns over initial stages of
adoption, without considering that there may be other ways APT or a CES
architecture might become widely adopted.

Then, you seem to prefer a CEE architecture - all of which are
*vastly* harder for anyone to adopt than APT, LISP or Ivip - and
where the benefits to adopters or scalable routing are far more
distant and unlikely ever to be realised.

I think some of the techniques you propose may be useful in principle
to some DFZ-using ASNs (ISPs and large PI-using end-user networks).
Whether they would be genuinely useful, considering their costs and
complexity, to many ASNs, I am not sure.

I think your techniques only provide marginal improvements in the
ability to run a network with routers which can't handle the full
DFZ, either in their RIBs or FIBs.  We need to handle millions of
end-user networks with portability, multihoming etc.  If done with PI
addresses, this would mean the number of prefixes in the DFZ grows by
a factor of 10 or more.

I believe AIS-Evolution could only be considered a potential solution
to the routing scaling problem if you made it a standalone system,
with much better documentation - and presented convincing arguments
for why this would be superior to the best of the CEE and CES
architectures.

I am completely opposed to your suggestion of long-term adoption of
CEE - see (msg05865) and (msg05864) - so for me, you would only have
to show it is superior to Ivip, since I think Ivip is better than the
other CES architectures.