[RAM] 1 Ivip ITR strategies, including in the host

Robin Whittle <rw@firstpr.com.au> Mon, 02 July 2007 14:31 UTC


Here are some ideas in 5 messages, which will make most sense if
read in this order.  I wrote multiple messages so any discussion
can be specific to each one.

1 Ivip ITR strategies, including in the host

2 Ivip ETR strategies, including in the host

3 Database & ITR update rates, mobility etc.

4 User-interface and delegation tree for central database
  (LISP/Ivip)

5 Database <--> ITR push, pull and notify


Here is an idea for combining numerous caching (pull) ITRs with
one or a few full-database (push) ITRs, to achieve generally
optimal paths whilst not delaying the first packets - as would
usually be the case with a caching ITR.

I also discuss having a caching ITR function as part of host
operating systems, to reduce the load on ITRs in the edge network
or beyond.

Probably a lot of what I write would also apply to LISP; however, I
will only refer to Ivip, since I still don't have a clear idea of
what LISP 1.5 involves.

Most of the people on this list know vastly more about running
routers, networks etc. than I do, so please point out anything I
seem to be missing.

In Ivip I do not anticipate the ITRs having any built-in Traffic
Engineering functions, such as would be required by the LISP I-D's
Priority and Weight (p15) variables.  That would be very costly to
implement.  An Ivip ITR can map any IPv4 address to any other IPv4
address (subject to config file limitations), or have the packet
dropped.  This should mean the encapsulation can be done by
conventional FIB mechanisms, without involving the router's
central CPU at all.  Traffic engineering, in the form of load
balancing, could still be done for a group of IP addresses using
two links: map some of the addresses so their packets are tunneled
to the ETR belonging to one ISP, and the rest so they are tunneled
to the ETR of the other ISP.  The two links to the two ISPs then
carry a chosen share of the load, depending on the traffic of each
mapped IP address.
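To make that concrete, here is a rough sketch in Python of the idea
of a flat per-address mapping table being used for load sharing.
The addresses, ETRs and the 75/25 split are all made-up examples,
not real allocations or anything defined by Ivip:

  # Sketch of Ivip-style mapping as a flat table: each mapped IPv4
  # address maps to exactly one ETR address, or to None (drop).
  # Addresses and ETRs below are hypothetical examples only.
  import ipaddress

  ETR_ISP_A = "192.0.2.1"      # ETR reached via the link to ISP A
  ETR_ISP_B = "198.51.100.1"   # ETR reached via the link to ISP B

  # Split one mapped /24 so roughly 75% of its addresses tunnel via
  # ISP A and 25% via ISP B - crude load balancing with no
  # Priority/Weight logic in the ITR itself.
  mapping = {}
  for host in ipaddress.ip_network("203.0.113.0/24").hosts():
      mapping[str(host)] = ETR_ISP_A if int(host) % 4 else ETR_ISP_B

  def lookup(dest_ip):
      """Return the ETR to tunnel to, or None if the address is not
      mapped (or should be dropped)."""
      return mapping.get(dest_ip)

  print(lookup("203.0.113.10"))   # -> 192.0.2.1    (via ISP A)
  print(lookup("203.0.113.12"))   # -> 198.51.100.1 (via ISP B)

Changing the load split is then just a matter of changing which ETR
each mapped address points to, with no special TE machinery in the ITR.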


Let's say there is an edge network which wants to be totally hip
to Ivip.  Ideally, there would be ITRs all over the network.
Ideally, each ITR would have a full copy of the database, which it
needs to get via a push mechanism.  However, this will be costly
in terms of traffic flow to the ITR and in terms of its ability to
decode the updates, store them, and write them to its FIB (which
must have a huge capacity) as they arrive.

Let's say the edge network is rather large, and the operators only
want to have a single full database ITR.  The operators want to
have a few hundred pull ITRs, all over their edge network.  These
query some internal or external system - as I explore in message
"5  Database <--> ITR push, pull and notify".

I am trying to find a way that these caching (pull) ITRs can pass
on packets addressed to the mapped address ranges (meaning they are
addressed to one of the "master-subnets" which the Ivip system
handles by encapsulating and tunneling to ETRs) even when they
don't already have mapping information.  If a caching ITR does have
mapping for the packet's destination address, its FIB will
encapsulate the packet and send it on its way to the ETR, wherever
that may be.

Let's say these caching ITRs have some mechanism for detecting that
a significant number of packets are being sent to an address range
which is covered by the Ivip mapping system, but for which the ITR
has not yet asked for mapping information.  Ideally this would be a
counter per IP address, but that would be extremely unwieldy or
impossible.  Perhaps a simpler system would do, such as a sampling
scheme which examines 1% of incoming packets' destination addresses
(when the router's CPU has nothing else to do), with an algorithm
then searching for two packets in the last few minutes which are
within the Ivip-mapped range but for which the ITR has not yet
asked for mapping information.
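A minimal sketch of such a sampling scheme in Python - the 1% rate
and the two-hit threshold come from the text above; the function
names and the helpers passed in (in_mapped_range, have_mapping,
request_mapping) are invented for illustration:

  import random
  import time

  SAMPLE_RATE = 0.01        # examine roughly 1% of destination addresses
  HIT_THRESHOLD = 2         # ask for mapping after two sampled hits
  WINDOW_SECONDS = 300      # "the last few minutes"

  recent_hits = {}          # destination address -> list of sample times

  def maybe_sample(dest_ip, in_mapped_range, have_mapping, request_mapping):
      """Sample a small fraction of destinations; when an address inside
      the Ivip master-subnets has been seen twice recently and no mapping
      is held for it, trigger a mapping query.  The three callables are
      assumed to be supplied by the ITR."""
      if random.random() > SAMPLE_RATE:
          return
      if not in_mapped_range(dest_ip) or have_mapping(dest_ip):
          return
      now = time.monotonic()
      hits = [t for t in recent_hits.get(dest_ip, [])
              if now - t < WINDOW_SECONDS]
      hits.append(now)
      recent_hits[dest_ip] = hits
      if len(hits) >= HIT_THRESHOLD:
          request_mapping(dest_ip)
          recent_hits.pop(dest_ip, None)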

Then the ITR can ask for the mapping information for this address
and update its FIB when the response arrives.  If the address is
part of a larger subnet which has the same mapping, the response
will say so, and the FIB entry for that subnet will then be in
place for future packets.

A distributed system for handling mapping requests such as that of
draft-meyer-lisp-cons could take many seconds to get a reply back
to this ITR.  Even with the snappier system I describe in Message
5, I think we don't want to delay the packets for which the
caching ITR doesn't yet have mapping information.  Nor do we want
to insist the caching ITR encapsulates every packet.

For now, we can assume that the delay in getting mapping via query
and response is so long that it will be unacceptable to delay all
packets which require mapping for this time.

The task then is to organise the ITRs so that every packet gets
handled by an ITR, with the "novel" ones (those for which the first
ITR to handle the packet - a caching ITR - has no mapping) taking a
probably longer-than-ideal path, while the bulk of the traffic has
no path delays, because the local caching ITRs have already got
mapping data for the destination addresses of these "bulk" packets.

Conceptually, what I am thinking of is thousands of hosts which
are sending packets.  Some of these packets need to be handled by
an ITR because they are to addresses within the Ivip-mapped
"master-subnets".  There are two classes of ITR for this part of
the discussion:

  ITRC  With cache of mapping data derived from responses to
        queries of the database by some means.  (Pull.)

  ITRD  Full copy of the database, because it receives a
        continual stream of updates. (Push.)

  IR    Ordinary internal router.

  BR    Border router - connects to the global BGP system.

I assume these functions are performed by routers which also do
all the other things routers are expected to do in such a location.

This plan is all inside the edge network:

................
                .
 Edge network    .
                  .
            ITRD   .    }
            /  \   .  / }
H--\       /    \  . /  }
    \     /      \ ./   }
H----ITRC0--------BR--- }
    /  |  \      / .\   }
H--/   |   \    /  . \  }
       |    \  /   .  \ }
       |     IR    .    }
       |    /  \   .  / }
H--\   |   /    \  . /  }
    \  |  /      \ ./   } BGP transit &
H----ITRC1--------BR--- } border routers
    /  |  \      / .\   } of the Internet
H--/   |   \    /  . \  }
       |    \  /   .  \ }
       |     IR    .    }
       |    /  \   .  / }
H--\   |   /    \  . /  }
    \  |  /      \ ./   }
H----ITRC2--------BR--- }
    /  |  \      / .\   }
H--/   |   \    /  . \  }
       |    \  /   .  \ }
       |     IR    .    }
       |    /  \   .  / }
H--\   |   /    \  . /  }
    \  |  /      \ ./   }
H----ITRC3--------BR--- }
    /              .\   }
H--/               . \  }
                   .  \ }
                  .     }
..................      }

Fig 1.

Plan A for Fig 1. is that every caching ITRC has its FIB set up so
that packets addressed to every IP address in the "master-subnets"
it does Ivip mapping for will either be encapsulated and tunneled
to the ETR near the host with the mapped address, or will be
encapsulated and tunneled to a single IP address of the full
database ITRD (at the top of the diagram).

This way, all packets are Ivip mapped, but those which are "novel"
to the first ITR they reach will probably go on a longer than
optimal path via ITRD, before being encapsulated and tunneled to
the proper ETR.
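Plan A amounts to a default entry in each ITRC's forwarding logic.
A rough sketch in Python - the ITRD address and the helper functions
(in_master_subnets, encapsulate, forward_normally) are placeholders
for whatever the router actually does:

  ITRD_ADDRESS = "192.0.2.254"   # hypothetical address of the full-database ITRD

  def itrc_forward(packet, cache, in_master_subnets, encapsulate,
                   forward_normally):
      """Plan A: packets to Ivip master-subnets are always encapsulated -
      either to the cached ETR, or to the ITRD as a fallback."""
      dest = packet["dst"]
      if not in_master_subnets(dest):
          forward_normally(packet)           # ordinary routing, Ivip not involved
          return
      etr = cache.get(dest)                  # mapping learned via earlier queries
      if etr is not None:
          encapsulate(packet, etr)           # optimal path straight to the ETR
      else:
          encapsulate(packet, ITRD_ADDRESS)  # "novel" packet: longer path via ITRD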

ITRD needs to be expecting encapsulated packets arriving on one of
its IP addresses, but it would be easy for it to pop off the
IP-in-IP header and then put the packets through its FIB.

Ideally, this will be a pretty small proportion of traffic to
Ivip-mapped addresses - and the main traffic flows through the
internal routers (in this example, the first internal routers are
all ITRCs but these ITRCs don't need to be the closest router to
the hosts - as I depict in Fig 2.) and then out via the nearest
border router.

Plan B is for the edge network's routing system to correctly
handle each ITRC spitting out a packet it knows is addressed to
part of an Ivip "master-subnet" but which it doesn't yet have
mapping data for - and the network forwarding that packet to the
one ITRD.

The internal routing system needs to ensure these "novel" packets
are always forwarded towards ITRD.  It would be acceptable or
desirable if they pass through one or more further ITRCs on their
way to ITRD.  If a packet did reach an ITRC which had mapping for
it, then that would be fine, because it would be tunneled from there.

Maybe it would work fine if each ITRC accepts packets addressed to
the Ivip master-subnets and forwards those which were not mapped
and encapsulated by its FIB on a link which leads them closer to
ITRD - which advertises these master-subnets.  For this to work,
it would be vital that none of the border routers announce paths
for these master-subnets (unless the border router sent the packets
towards the ITRD rather than out to the Internet).  The border
routers would announce paths for all the BGP-announced prefixes
other than the Ivip master-subnets and those prefixes for which
this edge network is the destination.
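In other words, the two router classes differ only in which prefixes
they announce into the internal routing system.  A toy sketch of
that split, with all prefixes being placeholders:

  import ipaddress

  master_subnets = {ipaddress.ip_network("203.0.113.0/24")}   # Ivip-mapped space (example)
  local_prefixes = {ipaddress.ip_network("198.51.100.0/24")}  # this edge network's own space
  all_bgp_prefixes = {ipaddress.ip_network(p) for p in
                      ("203.0.113.0/24", "198.51.100.0/24", "192.0.2.0/24")}

  # ITRD attracts the "novel" packets by announcing the master-subnets
  # internally.
  itrd_announces = master_subnets

  # Border routers announce everything else, so unmapped packets are
  # never drawn out to the Internet ahead of the ITRD.
  border_announces = all_bgp_prefixes - master_subnets - local_prefixes

  print(sorted(str(p) for p in border_announces))   # -> ['192.0.2.0/24']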

So any packet sent from inside the network would eventually find
its way to an ITR.

Fig 1 could be applied to LISP.  Fig 2 can't be, because as far as
I know, LISP doesn't involve EIDs being part of prefixes which are
advertised in BGP.

Alternatively, in Fig 2. the edge network could have no full
database ITR and would rely on the closest ITR(s) (presumably an
ITRD) in the BGP system to handle packets which its local ITRCs
let pass without encapsulation.

This would be cheaper for the edge network, and would not require
a constant inflow of update data for an ITRD.

(See message 5 for how the ITRCs need some Query Server to request
mapping information from, and how this Query Server (or Servers)
needs to lead to one which has a full copy of the database -
ideally somewhere close, best of all in the edge network.)


................
                .
 Edge network    .
                  .
                   .
                   .  /
H--\               . /      /
    \              ./      /
H----ITRC0--------BR-----TR
    /  |  \      / .\      \
H--/   |   \    /  . \      \
       |    \  /   .  \      \
       |     IR    .  TR-----ITRD---
       |    /  \   .  /
H--\   |   /    \  . /
    \  |  /      \ ./       BGP transit &
H----ITRC1--------BR---     border routers
    /  |  \      / .\       of the Internet
H--/   |   \    /  . \
       |    \  /   .  \
       |   ITRC2   .  TR----
       |   /   \   .  /
H--\   |  /     \  . /       /
    \  | /       \ ./       /
H-----IR          BR-----ITRD---
    /  | \       / .\     /
H--/   |  \     /  . \   /
       |   \   /   .  \ /
       |   ITRC3   .  TR
       |   /   \   .  /
H--\   |  /     \  . /
    \  | /       \ ./
H-----IR          BR---
    /              .\
H--/               . \
                   .  \
                  .
..................

Fig 2.

This shows how all paths taken by packets generated by hosts will
need to pass through at least one ITRC before exiting a border router.


Each ITRC needs to have a way of querying a copy of the central
database, probably via some caching proxy, or perhaps by querying
a nearby ITRD's internal database.  However, I think any ITRD is
probably going to be busy and it would be better to have a simple
server, not a router, to take a feed of the update stream, decode
it into an up-to-date copy of the database, and then respond to
queries from the ITRCs.
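The lookup side of such a query server could be as simple as the
sketch below, assuming it already holds an up-to-date copy of the
database keyed by prefix.  The data structures, names and prefixes
are purely illustrative:

  import ipaddress

  # Full copy of the mapping database, kept current by the update stream:
  # prefix -> ETR address (None would mean "drop").
  database = {
      ipaddress.ip_network("203.0.113.0/26"): "192.0.2.1",
      ipaddress.ip_network("203.0.113.64/26"): "198.51.100.1",
  }

  def answer_query(dest_ip):
      """Return (covering_prefix, etr) for the longest matching prefix, so
      the ITRC can install one FIB entry covering future packets to that
      whole subnet."""
      addr = ipaddress.ip_address(dest_ip)
      best = None
      for prefix, etr in database.items():
          if addr in prefix and (best is None
                                 or prefix.prefixlen > best[0].prefixlen):
              best = (prefix, etr)
      return best

  print(answer_query("203.0.113.70"))
  # -> (IPv4Network('203.0.113.64/26'), '198.51.100.1')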

A feature of both Fig 1 and Fig 2 is that there are a large number
of ITRCs.  The difference is the location of the ITRD to which
packets go when they are not mapped by the ITRCs, because the ITRC
hasn't yet made a query or hasn't yet received a response.

If an internal ITRD gets a feed of database updates, a local query
server or network of query servers could handle queries from local
ITRCs.

A smaller network which doesn't want to have either an ITRD
(expensive, because of its huge RAM and massive FIB capabilities)
or a query server receiving the full database updates will need
to rely on some external system to answer the queries of its ITRCs.


Ideally, there would be a way that every ITRC could automatically
discover:

1 - Two or more addresses to which mapping queries should be sent.

2 - How to handle packets for which it has not yet cached any
    mapping information.  For instance, what IP address to
    tunnel them to so they reach an ITRD, or some other way of
    handling them.

The ITRC would need to be able to discover how these change after
boot time too, so perhaps the information could come with a
caching time.
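The discovered information might look something like the record
below, with a caching time so the ITRC knows when to re-run
discovery.  None of this is standardised - the field names and
addresses are invented for illustration:

  import time
  from dataclasses import dataclass, field

  @dataclass
  class ItrcConfig:
      query_servers: list          # addresses to send mapping queries to (item 1)
      novel_packet_tunnel: str     # where to tunnel not-yet-mapped packets (item 2)
      cache_seconds: int = 3600    # how long this answer may be cached
      learned_at: float = field(default_factory=time.monotonic)

      def stale(self):
          """True when the ITRC should re-run discovery."""
          return time.monotonic() - self.learned_at > self.cache_seconds

  # Example (all addresses hypothetical):
  cfg = ItrcConfig(query_servers=["192.0.2.53", "198.51.100.53"],
                   novel_packet_tunnel="192.0.2.254")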


ITR function in hosts
---------------------

In other messages I refer to this as:

  ITFH  Ingress Tunnel Function in Host

This function is not really a router, since the host still only
has one interface.  It is just a function which encapsulates
packets so they are forwarded to the ETR.
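Per packet, the ITFH only has to prepend an outer IPv4 header.  A
bare sketch of that IP-in-IP encapsulation (protocol 4, RFC 2003
style); the header checksum is left at zero for brevity and the
addresses are placeholders, so this is an illustration rather than
a usable implementation:

  import socket
  import struct

  def encapsulate_ip_in_ip(inner_packet, src_ip, etr_ip):
      """Prepend a minimal outer IPv4 header (protocol 4 = IP-in-IP) so
      the packet is carried to the ETR, which strips it again.  A real
      implementation must compute the header checksum."""
      total_len = 20 + len(inner_packet)
      outer = struct.pack("!BBHHHBBH4s4s",
                          0x45,            # version 4, IHL 5 (no options)
                          0,               # DSCP/ECN
                          total_len,
                          0, 0,            # identification, flags/frag offset
                          64,              # TTL
                          4,               # protocol 4: IP-in-IP
                          0,               # header checksum (omitted in sketch)
                          socket.inet_aton(src_ip),
                          socket.inet_aton(etr_ip))
      return outer + inner_packet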

To reduce the load on the edge network's ITRC routers, it would be
desirable for some, many or ideally all hosts to perform their own
ITRC function.  This is not such an onerous task, unless the host
is a server whose packets have gone to a massive number of
destination addresses in recent minutes.

The ITRC function could be added to the operating system,
especially if there were an IETF-standard way it could find where
to send queries and what to do with packets it hasn't yet got
mapping for.  This is not at all essential for Ivip (or LISP) to
work - but it would remove load from the edge network's ITRCs and
ensure that packets travelled on the completely optimal path
towards the ETR.

ITRCs in the edge network cost money because they are routers,
with CPU, RAM and FIB resources devoted to Ivip mapping.  An ITFH
function in host operating software, or even in some of the
application software, would cost nothing and would be the fastest
and most efficient approach.

 - Robin         http://www.firstpr.com.au/ip/ivip/




