[RAM] ViP: Anycast ITRs in the DFZ & mobile tunnels

Robin Whittle <rw@firstpr.com.au> Fri, 15 June 2007 03:56 UTC

Here are some ideas which as far as I know are novel.  I will write
them up as an I-D if anyone thinks they might be useful.  I may well
have made some blunders - please let me know where I am mistaken.

The name for this proposal is ViP - Versatile redIrection of
Packets.  "ViP" is pretty lame, but it worked well for Rock Hudson
in 1961 (Lover Come Back, with Doris Day and Tony Randall).

  - Robin


This proposal is intended to be easily incrementally deployable,
while I think LISP would be difficult or impossible to incrementally
deploy.  I am assuming LISP 2 or higher, since 1 and 1.5 have no
benefits for the global BGP routing table size, as far as I know.

There's no mention of Identifiers or Locators here - just IPv4
addresses.  I am thinking in IPv4 terms, but the same scheme may be
useful for IPv6.


I am borrowing from LISP the concepts of the Ingress Tunnel Router
(ITR), the Egress Tunnel Router (ETR), the IP-in-IP tunnelling of
packets from ITR to ETR and ETRs being either border routers or
internal routers.  I assume a centralised database of information
which determines how a packet's destination IP address controls the
packet being encapsulated to be sent to a specific ETR address.  I
assume all the ITRs have a full up-to-date copy of this database,
via some chained or tree-structured, real-time, "push" update system.

The ITR functions in this scheme are performed by "V-routers",
meaning "Versatile".  In some instances a V-router may also perform
something like an ETR function, which I will call a TTR
(Translating Tunnel Router) but that is a separate issue.

The major differences with respect to LISP are:

1 -  LISP-mapped prefixes and individual IP addresses are not
     within any prefix which is advertised in the BGP system.

     ViP-mapped prefixes and individual addresses are always
     part of a prefix which is advertised in BGP.


2 -  With LISP, the ITRs are always located in edge networks.
     They may be interior routers or the border routers.  They
     accept incoming packets on interfaces which are not connected
     to eBGP peers.

     With ViP, the ITR function is performed by a "V-router" which
     is always a BGP router.  The V-router may be a border router
     - single-homed or multi-homed - or a transit router.  If the
     V-router is a border router, it may accept incoming packets
     for encapsulation on any of its interfaces: those connecting
     to eBGP peers and those connecting to internal routers.

     It makes most sense if the V-router ITR function is implemented
     as part of the workload of a multihomed border router or a
     transit router.  If this is the case, ViP places all ITRs in
     the DFZ.

     However, it would be technically possible to have the ITR
     function performed by a single-homed border router or by a
     specialised router which has only a single interface which
     connects to a transit router or to any border router.  In
     principle, the ITR function could be performed by the
     host which originates the packet, but I intend that the
     ITR function have a full copy of the global mapping database
     so this would be extremely costly.


3 -  For LISP-mapped addresses to be universally reachable, all
     edge networks need to implement ITR functions, either at
     their border router(s) or at one or probably many internal
     routers.  (Alternatively, edge networks would need to change
     their routing system to forward all packets not covered by a
     BGP advertised prefix to some external router which would
     provide the ITR function.)

     A ViP-mapped address will be universally reachable as long
     as there is a single reachable V-router which can receive
     packets from the original sender and send encapsulated packets
     to the ETR for this address.


4 -  Point 3 has a profound effect on how the two proposals may
     be incrementally deployed.  With LISP, there is a serious
     and perhaps insurmountable barrier to anyone deciding to
     connect a host to a LISP-mapped address.  The first people
     to do this will find their addresses generally unreachable
     since virtually no edge networks will have upgraded any
     of their internal or border routers to perform LISP ITR
     functions.  Why should they?  It is a huge investment to
     enable communications with a handful of rebels who are
     voluntarily opting out of the BGP system (albeit for the
     benefit of humanity).

     I can't imagine how LISP would ever get off the ground,
     since early adopters would have extremely patchy
     connectivity.  No-one would put a crucial service on a
     LISP-mapped address.  Since all crucial services would
     be available on ordinary BGP-routed addresses, why should
     edge networks spend money on a new or upgraded router?

     ViP requires no changes in the edge network of the sender.
     It can be implemented with a single ITR on some BGP router
     somewhere in the world, and a single ETR in the edge
     network of the destination host.


5 -  LISP assumes a single system.  I will assume a single
     ViP system in most of the discussion below, but there
     could be any number of independent ViP systems in
     operation, without conflict.  Indeed, a ViP system could
     be a profitable enterprise.


6 -  Because LISP needs to be a single system, it needs to be
     standardised by the IETF and IANA.  I think ViP could be
     implemented without any new IETF standardised protocols
     or IANA assignments.  I am not suggesting this be done -
     it would be best to create either a single unified ViP
     system or standards by which multiple ViP systems could
     all work in the same way.


7 -  One of the primary benefits of ViP, together with providing
     reachability from the outset without requiring sending
     edge networks to do anything, is that there would be a smaller
     number of ITRs in the system.  These will tend to be bigger
     than with LISP and will have more intensive work to do.

     I am thinking of direct RAM-based lookup approaches, involving
     two or so memory cycles, to speed this operation in hardware.
     Perhaps one or more models of current high-end router,
     such as the CRS-1, could perform these functions nicely,
     with an extensive firmware upgrade.


8 -  With ViP, packets to be encapsulated reach the edge of the
     BGP system or are forwarded within it, either to a single
     ITR (if the ViP system only has one V-router) or to one
     of multiple ITR functions in separate V-routers all of
     which have the same IP address.

     ViP uses Anycast to provide multiple redundant paths
     within the BGP system for packets to find their way to
     an ITR.  Anycast is not normally used for TCP
     communications, but I am hoping it will be appropriate
     here, since the individual ITRs only encapsulate the
     packet and tunnel it to the ETR.

     Assuming the sender Host A (HA) has a BGP-mapped address and
     the Host B (HB) destination address is ViP-mapped, the
     packets in the opposite direction use ordinary BGP routers
     to make their way from HB to HA.  If HA also has a ViP-mapped
     address, the same process occurs in the reverse direction,
     with the packets leaving HB, and being directed towards the
     nearest anycast V-router, which performs the ITR function and
     tunnels the packet to the ETR for HA.

     If the border routers of both edge networks are V-routers,
     which perform ITR functions and these same routers are also
     the ETRs for their hosts, then the paths of the packets will be
     the same in both directions.


9 -  LISP and what I will call "ViP-basic" both have their ETRs
     in edge networks, or at the border router.  These ETRs must
     have an interface with an IP address which is reachable via the
     global BGP system.  These only handle packets one way - from
     the sender HA to the LISP/ViP-mapped destination address of HB.
     Packets in the return direction HA <- HB are sent normally,
     without relying on tunnelling, if the destination HA's address
     is not LISP/ViP-mapped.  Otherwise, a similar process occurs
     with the packet being intercepted by an ITR and tunnelled to
     the ETR which handles HA's address.

     I will also describe a ViP-mobile system where the tunnel
     endpoint is an individual host, which establishes the
     tunnel with either the ITR or a TTR (Translating Tunnel
     Router) both of which are BGP routers with BGP-routable
     addresses.  The host requires suitable tunnelling software
     in its operating system or application program, but can
     be located anywhere on the Internet, including behind NAT.

     It is conceivable that a fully developed approach to this
     mobility idea could provide multiple tunnels over different
     upstream links from the target to one or more TTRs.  The
     tunnel between the TTR and the host would be bi-directional
     so the mobile host's current ISP doesn't need to know about
     the mobile host's ViP address, for instance for allowing
     upstream packets from this address.


10 - With LISP, there are an unknown number of ITRs, since there
     must be one or more in every edge network if LISP-mapped
     addresses are to be fully reachable.  With a single ViP system,
     there are a finite and generally smaller number of known ITRs,
     all part of the BGP network.  This makes it practical, or more
     practical than with LISP, to run a tightly meshed system for
     distributing updates of the mapping database to each ITR in
     real-time.

     If there was a single global ViP system then the ITRs would
     be operated by multiple ASes.  So a considerable amount of
     coordination would be required.  If there were multiple ViP
     systems, then the ITRs of each system may or may not be run
     by the one AS.


11 - Since ViP has a smaller number of routers performing the ITR
     functions, it is more practical to have them contain a complete
     up-to-date version of the mapping database.  This means that
     there are no delays, as in a LISP variant 2.x or 3.x "pull"
     model, which needs to request mapping information for packets
     with destination addresses it does not have mapping information
     for.   LISP variants are listed at:

     http://www1.ietf.org/mail-archive/web/ram/current/msg01289.html

     The delay in obtaining that mapping will usually hold up
     delivery for so long that the host application, or whatever
     else sent the packet, is likely to assume that the
     packet was dropped.  Although ViP could work with a "pull"
     query and cache system rather than a full database in each ITR,
     I am proposing every ITR have a database which is up-to-date
     within a few seconds of any changes being made to the master
     database in some central, or distributed, control system.
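
To illustrate the "push" model of point 11, here is a minimal Python
sketch of a tree-structured update system.  It assumes each update
carries a sequence number and that every node simply relays what it
accepts to its children.  The class, its fields, the relay names and
the ETR addresses (taken from documentation ranges) are all
hypothetical.  A real system would also need authentication,
retransmission and handling of gaps in the sequence.

```python
from dataclasses import dataclass, field


@dataclass
class MappingReplica:
    """One node in a hypothetical tree-structured push system.

    Each node holds a full copy of the mapping and relays every update
    it accepts to its children, so an ITR never has to query for a
    mapping when a packet arrives.
    """
    name: str
    children: list = field(default_factory=list)
    mapping: dict = field(default_factory=dict)   # address range -> ETR
    last_seq: int = 0

    def push(self, seq, addr_range, etr):
        # Ignore duplicates and out-of-date updates.
        if seq <= self.last_seq:
            return
        self.mapping[addr_range] = etr
        self.last_seq = seq
        for child in self.children:
            child.push(seq, addr_range, etr)


# Master database feeding two relays, which feed the ITRs VR1 and VR2.
vr1 = MappingReplica("VR1")
vr2 = MappingReplica("VR2")
master = MappingReplica("master", children=[
    MappingReplica("relay-1", children=[vr1]),
    MappingReplica("relay-2", children=[vr2]),
])

master.push(1, "22.22.0.0-22.22.0.15", "192.0.2.1")
master.push(2, "22.22.0.0-22.22.0.15", "198.51.100.1")  # mapping changed
```

After the second push, both VR1 and VR2 hold the new ETR address for
the range without having asked for it.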


There are no doubt many pitfalls in all this.  I am only considering
unicast packets, not multicast.


Example of ViP-basic
--------------------

ASCII Art only works with fixed width fonts:


  Edge Network A
                          [ITR]
HA---(IR)----(BR1NA)------(VR1)-----(TR2)--(TR4)   Edge Network B
                    \         \                \
                     \         \                \        [ETR]
                      (TR1)----(VR2)---(TR3)---(BR1NB)---(IR)---HB
                         \             /
                          \           /
                         (TR5)----(TR6)
                                      \
                                       \           Edge Network C
                                        \ [ETR]
                                        (BR1NC)----HC
                                               \
                                                ---HD




HA's address is 11.11.11.11, which is part of a prefix which is
advertised in BGP and which is not covered by ViP mapping.

HB's address is 22.22.0.1, which is part of 22.22.0.0/16, which is
also advertised in BGP and which is ViP-mapped.  The other hosts'
addresses are:

  HC = 22.22.4.1
  HD = 22.22.4.2

There are three border routers and two V-routers.  In this initial
example of HA sending a packet to HB, the ITR function is performed
by VR1.  Both VR1 and VR2 also perform normal BGP transit router
functions.

Edge Network A has no ViP-equipped internal or border routers.

Edge Networks B and C have a router which performs ETR functions.
In this example, Edge Network B uses an internal router and Edge
Network C uses its border router.

Edge Networks A and B are multihomed and C is singlehomed.

The packet makes its way through Edge Network A to BR1NA which
recognises its destination address as being within the BGP
advertised prefix 22.22.0.0/16.

Both VR1 and VR2 advertise this prefix 22.22.0.0/16.

BR1NA forwards the packet to VR1, because this is the closest.

If VR1 had become unreachable through that direct link, BR1NA would
have forwarded the packet to TR1 instead, which would have forwarded
the packet to VR2, which would have performed the same ITR function
as I will describe for VR1.
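
The anycast behaviour just described can be sketched in Python as a
search for the closest router advertising the prefix.  This is only
an illustration: the link list is my reading of the ASCII diagram,
and a plain hop count stands in for real BGP best-path selection,
which depends on AS paths and local policy.  The function name is
hypothetical.

```python
from collections import deque

# Links read off the diagram above (an approximation of the ASCII art).
LINKS = [
    ("BR1NA", "VR1"), ("BR1NA", "TR1"), ("VR1", "TR2"), ("VR1", "VR2"),
    ("TR2", "TR4"), ("TR4", "BR1NB"), ("TR1", "VR2"), ("VR2", "TR3"),
    ("TR3", "BR1NB"), ("TR1", "TR5"), ("TR5", "TR6"), ("TR6", "TR3"),
    ("TR6", "BR1NC"),
]
ANYCAST_ORIGINS = {"VR1", "VR2"}   # both advertise 22.22.0.0/16


def nearest_origin(start, links, origins):
    """Breadth-first search: return (origin, hop_count) for the closest
    router advertising the prefix, or None if none is reachable."""
    neighbours = {}
    for a, b in links:
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, hops = queue.popleft()
        if node in origins:
            return node, hops
        for nxt in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, hops + 1))
    return None


print(nearest_origin("BR1NA", LINKS, ANYCAST_ORIGINS))

# The same question with the direct BR1NA-VR1 link down.
degraded = [link for link in LINKS if link != ("BR1NA", "VR1")]
print(nearest_origin("BR1NA", degraded, ANYCAST_ORIGINS))
```

With all links up the answer is VR1 at one hop.  With the direct
link removed it becomes VR2 at two hops, via TR1.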

VR1 recognises that the destination address of this packet is for a
BGP advertised prefix which is covered by ViP mapping.  It therefore
uses the IP address to look up the IP address of the ETR the packet
should be tunnelled to.

While it would be possible to ViP-map an entire BGP-advertised
prefix to a single ETR, this is not how I intend it to operate.  To
do so would not achieve any reduction in the global BGP routing table.

Instead, I intend that ViP-mapping be applied generally to rather
large (in terms of number of addresses) prefixes (that is, a short
prefix) and that each such prefix should have many fine divisions in
terms of which edge networks the ETRs are located in.

It is technically possible, although perhaps hard work for the ITR,
to have a different ETR for every IP address.  In this example, a
/16 has 65,536 IP addresses.  The mapping is, in part:

22.22.0.0 to 22.22.0.15   ---> Internal router of Edge Network B.

22.22.4.0 to 22.22.4.3    ---> Border router of Edge Network C.

In this way, a single BGP prefix could have its packets tunnelled to
thousands of edge networks.
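
Here is a minimal Python sketch of how such a per-address mapping
could be held in a directly indexed table, in the spirit of the
RAM-based lookup mentioned in point 7.  The first access finds the
table for a mapped /16 and the second reads the entry for the exact
address.  The ETR addresses are hypothetical (documentation ranges),
as are all function names, and the sketch assumes every mapped
prefix is a /16 and that a range never crosses a /16 boundary.

```python
import ipaddress
from array import array

# Hypothetical ETR addresses; index 0 means "no mapping for this address".
ETRS = [None, "192.0.2.1", "198.51.100.1"]

# Stage 1: ViP-mapped /16 prefixes, keyed by the top 16 bits.
# Stage 2: one 16-bit entry per address, holding an index into ETRS.
tables = {}


def add_mapped_prefix(prefix):
    net = ipaddress.ip_network(prefix)
    assert net.prefixlen == 16, "sketch handles /16 prefixes only"
    tables[int(net.network_address) >> 16] = array("H", [0]) * 65536


def map_range(first, last, etr_index):
    lo = int(ipaddress.ip_address(first))
    hi = int(ipaddress.ip_address(last))
    table = tables[lo >> 16]
    for addr in range(lo, hi + 1):
        table[addr & 0xFFFF] = etr_index


def lookup_etr(dest):
    addr = int(ipaddress.ip_address(dest))
    table = tables.get(addr >> 16)        # first memory access
    if table is None:
        return None                       # not ViP-mapped: forward normally
    return ETRS[table[addr & 0xFFFF]]     # second memory access


add_mapped_prefix("22.22.0.0/16")
map_range("22.22.0.0", "22.22.0.15", 1)   # Edge Network B's internal router
map_range("22.22.4.0", "22.22.4.3", 2)    # Edge Network C's border router

print(lookup_etr("22.22.0.1"))     # HB
print(lookup_etr("22.22.4.2"))     # HD
print(lookup_etr("11.11.11.11"))   # HA, not ViP-mapped
```

The cost of this layout is memory rather than time: 65,536 entries
per mapped /16, whether or not each address is in use.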

VR1 IP-in-IP encapsulates the packet and forwards the packet
according to the BGP routing system for the address of the IR
internal router in Edge Network B.

When it arrives there, the Internal Router recognises it as a
tunnelled packet, strips off the outer IP header and then processes
it as it does any other packet.  This causes the packet to be
forwarded to HB.
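
The encapsulation and decapsulation steps can be sketched in Python
as follows.  This assumes plain IPv4-in-IPv4 with protocol number 4,
as in RFC 2003, and a fixed 20-byte outer header with no options.
The function names and the addresses in the usage lines are
hypothetical, and the "inner packet" is just a placeholder byte
string rather than a real IP packet.

```python
import socket
import struct

IPPROTO_IPIP = 4   # IPv4-in-IPv4 protocol number (RFC 2003)


def ipv4_checksum(header):
    """Internet checksum over a header whose checksum field is zero."""
    total = sum(struct.unpack("!%dH" % (len(header) // 2), header))
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def encapsulate(inner_packet, itr_addr, etr_addr, ttl=64):
    """Prepend a 20-byte outer IPv4 header addressed to the ETR."""
    header = struct.pack(
        "!BBHHHBBH4s4s",
        0x45,                      # version 4, header length 5 words
        0,                         # DSCP/ECN
        20 + len(inner_packet),    # total length
        0, 0,                      # identification, flags/fragment offset
        ttl,
        IPPROTO_IPIP,
        0,                         # checksum placeholder
        socket.inet_aton(itr_addr),
        socket.inet_aton(etr_addr),
    )
    checksum = struct.pack("!H", ipv4_checksum(header))
    return header[:10] + checksum + header[12:] + inner_packet


def decapsulate(outer_packet):
    """Strip the outer header and return the inner packet unchanged."""
    if outer_packet[9] != IPPROTO_IPIP:
        raise ValueError("not an IP-in-IP packet")
    header_len = (outer_packet[0] & 0x0F) * 4
    return outer_packet[header_len:]


inner = b"original packet from HA to HB"
outer = encapsulate(inner, "203.0.113.1", "192.0.2.1")   # ITR -> ETR
print(decapsulate(outer) == inner)
```

The ETR never needs the mapping database: it only removes the outer
header and forwards what is left.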

In the return direction, it is vital that BR1NB allow packets with
the source address of HB into the BGP network.

Assuming there is an already existing system of V-routers to perform
ITR functions (actually, only one is required) then here is what the
administrators of Edge Network B need to do in order to gain
ViP-mapped addresses for hosts such as HB.

1 - They need to trust whoever runs the V-routers, which effectively
    means whoever has been assigned the prefix 22.22.0.0/16 - in
    this example AS7777.

    In terms of long-term stability, portability etc. of HB's IP
    address, everything depends on AS7777 retaining this prefix,
    and controlling the V-routers so that HB's IP address is
    correctly handled.

    AS7777 may charge a fee per IP address or via traffic volume
    to cover its costs - which are primarily running a bunch of
    fast, expensive, V-routers at various points in the DFZ, to
    provide robust coverage for all sources and destinations,
    without excessively longer routes than would be required if the
    packets were not being handled with ViP.


2 - They pay a fee, make an arrangement etc. by which they can
    tell AS7777's system the IP address of the ETR which they want
    packets tunnelled to, whenever those packets are addressed to
    HB's new ViP-mapped IP address.  In this case, Edge Network B
    gets 16 IP addresses from AS7777: 22.22.0.0/28.

    How the ViP-mapping works within a BGP-mapped subnet is a
    matter for AS7777 to determine.  There is not necessarily any
    reason to stick to binary boundaries.


3 - They set up a router to perform ETR functions.  The size and
    location of this router depends on how much traffic they are
    expecting on ViP-mapped tunnels.


4 - They configure their routers to handle this 22.22.0.0/28
    prefix.  They configure HB and any other hosts to have the
    appropriate addresses.  Part of this is ensuring that their
    border router will allow packets with this source address out to
    the BGP routing system.
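
The source-address check in step 4 could look something like this
minimal Python sketch.  198.51.100.0/24 stands in for Edge Network
B's own BGP-advertised prefix, which is not given above, and the
function name is hypothetical.

```python
import ipaddress

# Prefixes whose source addresses BR1NB lets out to the BGP network.
# 198.51.100.0/24 is a hypothetical stand-in for Edge Network B's own
# prefix; 22.22.0.0/28 is the ViP-mapped block obtained from AS7777.
ALLOWED_SOURCES = [
    ipaddress.ip_network("198.51.100.0/24"),
    ipaddress.ip_network("22.22.0.0/28"),
]


def egress_permitted(source):
    """True if a packet with this source address may leave the network."""
    addr = ipaddress.ip_address(source)
    return any(addr in net for net in ALLOWED_SOURCES)


print(egress_permitted("22.22.0.1"))    # HB
print(egress_permitted("22.22.4.1"))    # HC belongs to Edge Network C
```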


HB's ViP-mapped address is reachable by all edge networks, since
every edge network will forward the packets to one of the V-routers.
Edge Network A is multihomed, and this multihoming applies to the
way the packets will reach a nearby V-router in the most efficient
fashion.

Edge Network B is multihomed, and this applies to how tunnelled
packets are delivered to its ETR internal router.


Similar principles apply to Edge Network C, except that it is not
multihomed.

So far, unless I have made a mistake, we have seen how ViP can map
single IP addresses and/or longer prefixes of a single, shorter
BGP-advertised prefix to hosts in a very large number of edge
networks, each of which needs at least one BGP-advertised prefix in
order that its ETR be reachable, and for packets in the reverse
direction.

Now imagine that HB is owned by company Big Inc.  The above example
would be the same, except that the actions and responsibilities
would be different.

1 - Edge Network B would have already installed an ETR, so its
    customers could use ViP-mapped addresses.

2 - Big Inc. would have dealt with AS7777 to obtain lasting rights
    to the prefix 22.22.0.0/28.  Edge Network B may have helped them
    do this.

3 - Big Inc. (perhaps with the help of Edge Network B) registers the
    address of Edge Network B's ETR with AS7777 as being the ETR
    for all its 16 IP addresses.


Now let's say Big Inc. wants to move its HB to another Edge Network
E.  This involves:

1 - Edge Network E already having an ETR, and configuring it and its
    network to handle 22.22.0.0/28.

2 - Big Inc. using some kind of real-time, cryptographically
    secured, update system to change AS7777's system's ViP mapping
    for 22.22.0.0/28 to the address of the ETR of Edge Network E.

Edge Network B doesn't have to do anything.

So as long as AS7777 run their V-routers and their database as
promised, and as long as Big Inc. can find an ISP which will run an
ETR and configure their network accordingly, they can take their
AS7777 provided subnet to whichever such ISP they like.
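
As one possible shape for the secured update in step 2, here is a
minimal Python sketch using an HMAC over the update.  The use of
HMAC-SHA256 with a shared secret is purely my assumption for the
sake of illustration, as are the field names, the key and the ETR
addresses.  A real system might well use public-key signatures
instead.

```python
import hashlib
import hmac
import json


def sign_update(key, update):
    """HMAC-SHA256 over a canonical JSON encoding of the update."""
    body = json.dumps(update, sort_keys=True).encode()
    return hmac.new(key, body, hashlib.sha256).hexdigest()


def apply_update(mapping, keys, update, signature):
    """Apply the update only if it is signed by the holder of the block
    and is newer than the last accepted update for that block."""
    key = keys.get(update["block"])
    if key is None:
        return False
    if not hmac.compare_digest(sign_update(key, update), signature):
        return False
    current = mapping.get(update["block"])
    if current is not None and update["seq"] <= current["seq"]:
        return False                      # replayed or stale update
    mapping[update["block"]] = {"etr": update["etr"], "seq": update["seq"]}
    return True


# AS7777's view: Big Inc. holds 22.22.0.0/28, currently at Network B's ETR.
keys = {"22.22.0.0/28": b"hypothetical-shared-secret"}
mapping = {"22.22.0.0/28": {"etr": "192.0.2.1", "seq": 1}}

# Big Inc. moves the block to the ETR of Edge Network E.
move = {"block": "22.22.0.0/28", "etr": "203.0.113.9", "seq": 2}
signature = sign_update(keys["22.22.0.0/28"], move)
print(apply_update(mapping, keys, move, signature))
```

The sequence number is there so that an old, correctly signed update
cannot be replayed to move the addresses back.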


Example of ViP-basic on a host-by-host basis
--------------------------------------------

It would be possible to implement the ETR function in HB directly.
This would enable HB to have its ViP-mapped IP address in any
supportive ISP.  This doesn't necessarily achieve much, since HB
still needs an ordinary BGP-mapped and routed IP address so it can
be its own ETR.

However, Big Inc. does gain IP addresses which are completely
portable (at least between ETR-supportive ISPs), with multihoming
(assuming the ISP is multihomed), without burdening the global BGP
routing table with another advertised prefix.

It can also have a single host, performing its own ETR functions,
but operating multiple ViP mapped IP addresses, if that serves any
purpose.


Traffic Engineering
-------------------

In the first example, I think Edge Network B and Big Inc. could work
together to fine-tune the ViP-mapping of the one or more IP
addresses which are carrying a lot of traffic.  Edge Network B may
want to balance the load of incoming traffic across its two upstream
links.

One way of doing this would be to have two ETRs, each with an IP
address on a separately advertised BGP prefix.  One prefix would be
advertised on the link to TR4 and the other on the link to TR3.

Then, assuming both ETRs could send packets to HB, it would be
possible to control incoming traffic by altering the ViP mapping for
HB to one or the other of these ETRs.
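
If several ViP-mapped addresses carry significant traffic, the
choice of ETR for each could be made along the lines of this minimal
Python sketch.  The greedy rule, the ETR labels and the traffic
figures are all invented for illustration.

```python
def balance(traffic, etrs):
    """Greedy split: heaviest addresses first, each mapped to the ETR
    that is carrying the least traffic so far."""
    load = {etr: 0 for etr in etrs}
    assignment = {}
    for addr, rate in sorted(traffic.items(), key=lambda kv: -kv[1]):
        target = min(etrs, key=lambda e: load[e])
        assignment[addr] = target
        load[target] += rate
    return assignment, load


# Invented incoming traffic per ViP-mapped address, in Mbit/s.
traffic = {
    "22.22.0.1": 60,
    "22.22.0.2": 30,
    "22.22.0.3": 25,
    "22.22.0.4": 5,
}
etrs = ["etr-via-TR4", "etr-via-TR3"]

assignment, load = balance(traffic, etrs)
print(assignment)
print(load)
```

Each entry of the resulting assignment would then be sent to the
ViP system as a mapping change for that address.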



ViP-Mobile
----------

The purpose of ViP-mobile is to enable a single ViP-mapped IP
address to be reached by virtually any Internet-connected host,
including any host which:

1 - Has a direct BGP-routable IP address, or:

2 - Is behind one or more NAT firewalls, but can initiate a
    tunnel session with one or more TTRs, or:

3 - Is on an IP address which is mapped with ViP-basic.

The idea is that the mobile host creates a two-way encrypted tunnel
to a TTR, with some mechanisms for:

1 - Choosing a single nearby TTR.

2 - Perhaps maintaining two tunnels with two TTRs, with a fancier
    ViP mapping system which adapts to tunnel from the ITRs to
    whichever TTR currently can reach the mobile host.
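
For the two-tunnel case, the choice of which TTR the ITRs should
tunnel to might be made as in this minimal Python sketch.  It
assumes the mobile host sends keepalives over each tunnel and that a
tunnel is treated as dead after a timeout.  Neither mechanism is
specified above, and the names and the 10 second timeout are
hypothetical.

```python
def choose_ttr(last_keepalive, now, timeout=10.0, current=None):
    """Pick the TTR that ITRs should tunnel to for one mobile host.

    last_keepalive maps a TTR address to the time the host's tunnel to
    that TTR was last seen alive.  Stay with the current TTR while it
    is alive, to avoid needless mapping changes; otherwise take the
    most recently heard live TTR, or None if no tunnel is up.
    """
    live = {ttr: seen for ttr, seen in last_keepalive.items()
            if now - seen <= timeout}
    if current in live:
        return current
    if not live:
        return None
    return max(live, key=live.get)


keepalives = {"192.0.2.10": 100.0, "198.51.100.10": 95.0}
print(choose_ttr(keepalives, now=102.0, current="198.51.100.10"))
print(choose_ttr(keepalives, now=108.0, current="198.51.100.10"))
```

In the first call both tunnels are alive, so the mapping stays where
it is.  In the second the current TTR has timed out and the mapping
would be changed to the other one.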

The TTR would function as an ETR for the purposes of packets being
sent to the mobile host, but it would send them to the host by a
second, encrypted and possibly compressed, VPN tunnel.

The TTR would receive packets from the host via the VPN tunnel and,
in the first instance, forward them so they were handled by the
global BGP routing system.  There, they would either be sent
directly to their destination or, if the destination was a
ViP-mapped address, be forwarded to the nearest V-router and
tunnelled from there.

The TTR could also function as a ViP ITR for packets to ViP-mapped
addresses.  However, being a ViP ITR is an onerous task, even for
small packet volumes, unless it uses a pull approach to mapping,
which enables it to maintain only a partial version of the mapping
database.

There is obviously a lot more to think about with mobility.
Ordinarily the mobile host can use a tunnel to one active TTR.
Ideally this is near the mobile device - which is why a global
network of V-routers which can be ITRs and TTRs would be a good thing.

The idea is to have very light-weight mobility, with the mobile
device automatically finding a nearby TTR, authenticating itself,
and being online without any arrangements with whichever ISP it is
currently connected to.



Placement of V-routers (ITRs)
-----------------------------

One way of implementing ViP is to have a single global system, in
which case, there could be two models, at least:

1 - No fee associated with the mapping of IP addresses or traffic.

    In this case, there needs to be some reason why people such as
    transit providers and ISPs want to run V-routers without being
    paid for the traffic they must handle.

2 - A single global system with some kind of fee for addresses
    and/or traffic.

    This raises questions of charging and billing, and distribution
    of monies.


Although a single ViP system for the whole world is attractive in
terms of elegance, if the idea is a good one, there is nothing I
know of which would stop any organisation setting up its own ViP
network.  All it needs is an RIR's agreement to use assigned IP
addresses in this way, and then the resources to install and run a
bunch of V-routers around the world.

There could be significant revenue opportunities in privately run
ViP systems, as long as there was a well standardised approach to
the ETR function, which needs to operate on the border routers or
internal routers of many ISPs, before multiple ViP systems would be
viable.

With many or most ISPs running one or more ETRs for free, as part of
their general service to customers, it would be possible for
multiple competing ViP systems to work alongside each other, each
with one or more largish assignments of BGP-routable address space
to split up.

If there were four such competing systems, with similar or identical
technical principles of operation, there would be opportunities for
other companies, such as ISPs, to run a single router which performs
the ETR functions for one, two, three or all four of the competing
global systems - probably for a fee.

An ISP may want to create its own ETR function at its border routers
or very close to them, in order to minimise or eliminate outgoing
and incoming traffic for ViP-mapped addresses within its own
network.  For instance, an ISP in Australia might have a ViP-mapped
host on its Sydney network, but doesn't want to make all the routers
in its other cities aware of that IP address.  Hosts in other cities
would communicate with the Sydney host via a V-router ITR function,
so it would be best if the ISP itself had a V-router in each city,
so the packets don't escape to the wider Net en-route to the nearest
anycast V-router.

