Re: Comments on alternatives

Curtis Villamizar <curtis@ans.net> Thu, 09 March 1995 03:29 UTC

Received: from ietf.nri.reston.va.us by IETF.CNRI.Reston.VA.US id aa18317; 8 Mar 95 22:29 EST
Received: from maelstrom.acton.timeplex.com by IETF.CNRI.Reston.VA.US id aa18302; 8 Mar 95 22:28 EST
Received: from curtis.ansremote.com (curtis.ansremote.com [152.161.2.3]) by maelstrom.acton.timeplex.com (8.6.9/ACTON-MAIN-1.2) with ESMTP id WAA06572 for <rolc@maelstrom.timeplex.com>; Wed, 8 Mar 1995 22:25:34 -0500
Received: from curtis.ansremote.com (localhost [127.0.0.1]) by curtis.ansremote.com (8.6.9/8.6.5) with ESMTP id WAA01406; Wed, 8 Mar 1995 22:04:38 -0500
Message-Id: <199503090304.WAA01406@curtis.ansremote.com>
To: Joel Halpern <jhalpern@newbridge.com>
cc: rolc@maelstrom.timeplex.com
Reply-To: curtis@ans.net
Subject: Re: Comments on alternatives
In-reply-to: Your message of "Wed, 08 Mar 1995 08:17:02 +0500." <9503081317.AA16436@lobster.Newbridge.COM>
Date: Wed, 08 Mar 1995 22:03:23 -0500
Sender: ietf-archive-request@IETF.CNRI.Reston.VA.US
From: Curtis Villamizar <curtis@ans.net>

In message <9503081317.AA16436@lobster.Newbridge.COM>, Joel Halpern writes:
> Having posted some of the difficulties to the list, I certainly agree that
> there are some problems with our transit-router to transit-router
> behavior.
> 
> There are several possible approaches to the solution.
> 
> 1) Rifs/BGP Queries are a methodology outlined in an ID.  The primary
>     concern I have with this particular methodology is its interaction
>     with multiple levels of aggregation.  If there are multiple levels,
>     one is required to make several queries and maintain several
>     "relationships" with aggregating routers.  This seems to be significant
>     overhead.

There is a tradeoff here.  If an aggregation occurs at router A and
the aggregates are passed to B and then C and D to n sources S_1 to
S_n, then either the sources can send RIFs to their immediate peers
who can pass RIFs back, or the sources can establish a direct peering
with the aggregator.

If the aggregation is performed early in the process (at router A) and
there is a very large number of sources S_1 to S_n, then the number of
connections to A will become too large.  If so, then it becomes
desirable for RIFs to be passed up the chain rather than to the
aggregator.  If there are a very large number of sources, then it may
also be true that B and possibly also C and D have a very large subset
of full routing and the state overhead of RIFs plus the processing
overhead of exchanging RIFs might be high enough that it makes sense
to just give B and possibly C and D full routing and let them
aggregate.
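
A rough sketch of that tradeoff, counting the sessions each router has
to terminate.  The function name and the even split of sources between
C and D are my assumptions for illustration only; nothing here comes
from the ID.

```python
def session_load(n_sources, direct_peering):
    """Sessions terminated at each router in the A -> B -> {C, D} example.

    Assumption: the n sources are split evenly between C and D.
    """
    via_c = n_sources // 2
    via_d = n_sources - via_c
    # Hop-by-hop sessions that exist anyway: A-B, B-C, B-D, and one
    # session from each source to its immediate peer (C or D).
    load = {"A": 1, "B": 3, "C": 1 + via_c, "D": 1 + via_d}
    if direct_peering:
        # Every source also opens a session straight to the aggregator.
        load["A"] += n_sources
    return load
```

With direct peering the load at A grows linearly with the number of
sources; with RIFs passed up the chain A keeps a single session, but B,
C and D then carry the RIF state and processing instead.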

IMO it makes most practical sense to do the aggregation very close to
the (traffic) sources in this example (do aggregation at routers C and
D) and give routers within the network full routing.  It might make
the most sense to keep full routing (without further aggregation of
off NBMA components) in all of the routers and do aggregation at the
borders to accommodate border routers that can't handle full routing
(aggregating to a default route for single homed customers) or to do
any proxy aggregation that becomes possible (this would be proxy
aggregation only since the home AS in this case is by definition off
NBMA and if aggregation were possible before reaching the NBMA, it
would have been done already).

The host or router behaviour is then quite simple.  Follow the route
to the next hop, unless RSVP indicates a reservation is desired (or
whatever the chosen indication that a direct connection is preferred).
When a direct path is needed, a RIF is sent upstream to the
aggregator, which is the immediate next router for which a routing
session already exists.  This NBMA internal router installs the RIF,
and immediately starts keeping the border router up to date on routing
for the component route that was requested (without inundating the
border router with full routing).
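
A minimal sketch of that behaviour, assuming a toy model in which a RIF
is just a request for one component route.  The class and method names
are hypothetical and not taken from any draft.

```python
class Aggregator:
    """NBMA-internal router holding full routing (hypothetical model)."""

    def __init__(self, full_routes):
        self.full_routes = dict(full_routes)  # prefix -> NBMA exit router
        self.rifs = {}                        # prefix -> interested border routers

    def install_rif(self, border, prefix):
        # Remember the interest and send the current component route at once.
        self.rifs.setdefault(prefix, set()).add(border)
        border.direct[prefix] = self.full_routes[prefix]

    def route_change(self, prefix, new_exit):
        # Only border routers that asked for this component hear about it.
        self.full_routes[prefix] = new_exit
        for border in self.rifs.get(prefix, ()):
            border.direct[prefix] = new_exit


class BorderRouter:
    def __init__(self, aggregator):
        self.aggregator = aggregator  # immediate next router, session already up
        self.direct = {}              # component routes learned via RIF

    def next_hop(self, prefix, wants_direct=False):
        if not wants_direct:
            return self.aggregator    # default: follow the route to the next hop
        if prefix not in self.direct:
            self.aggregator.install_rif(self, prefix)
        return self.direct[prefix]
```

A border router that never asks for a direct path keeps an empty
`direct` table, which is the point: it is not inundated with full
routing.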

At some point aggregation will have to be moved farther back to reduce
the routing load on the routers internal to the NBMA.  Right now there
is the potential for a very large number of routers to come to the NBMA
with a very large number of routes with fairly modest gains due to
additional aggregation and the routers internal to the NBMA can handle
full routing.  This will change as more end customers (presumably
corporations) attach directly to the NBMA (less so for Juha).

There is considerable disagreement as to the date when OC3 ATM
attachments are as commonplace as telephone handsets.  ;-)

As the need to move aggregation back increases, we can revisit the
choice to connect to the aggregator and consider whether the
complexity of closer to full routing in the intermediate routers
outweighs the number of routing connections that the aggregator
will have to support.

I suspect that for the longer term solution an NBMA will be
partitioned into major hierarchies and subhierarchies.  At the top
level hierarchy, the load will be greatest.  It might still make sense
to aggregate in the top hierarchy but have separate aggregators
feeding subhierarchies to keep the load down.  Within a subhierarchy
an aggregator closer to the destinations being aggregated will have to
serve some number of connections likely to be proportional to the number
of destinations being aggregated and how active those destinations
are.

The routing state can be expected to be relatively static compared to
the change in the needs for routing (the exchange of RIFs).  That is
why I favor aggregation close to the source (traffic source that is
receiving the routing information) and only if the state load in the
core gets too high, aggregation near the destination (traffic
destination) with RIF exchange and routing updates not passed through
the core, but rather exchanged directly with the aggregator.  This keeps
the core from incurring work when direct connections are being set up
anywhere in the NBMA, which passing RIFs and routing up and down the
routing hierarchy would require (and which NHRP will require).

> 2) NHRP with state exchange could be used in certain situations.  If both
>     ends are BGP routers (or both are intra-domain routers within the same
>     domain) a degenerate exchange between the two will allow the detection
>     of routing loops, and their removal.

Seems to me like arm waving.  I don't know of any mechanism in NHRP to
accomplish this.  I think we established that full routing attributes
are needed to prevent the loop.  Bottom line is then again to exchange
full routing.

> 2a) In order to do this however, we would also have to specify what happens
>     when the query crosses the intra/inter border.  Is it terminated.  Is
>     the query propagated, and the response replaced if it turns out to be
>     router-router?  (There are enough bits to tell this.)  Or is there
>     something else to do in this case.

For this case you need to have a routing peering, even if indirect, and
you can't trust the NHRP data.  We've established this.

> 3) Or should we punt the whole thing back to a query that ONLY works for
>     host resolution.  I personally would like a mechanism which worked
>     for host-host, host-router, and appropriate router-router if that can
>     be determined safely and reliably.

YES.  It only works for resolution of addresses that are directly on
the NBMA.  It will also work for destinations behind single homed
routers if that router claims that destinations behind it are on the
local media and then makes it appear as though its physical interface
resolves the destination address query.  In other circles this is
called a proxy ARP and its safe application is quite limited.  The
names of the bits have changed, but it is still a proxy address
resolution operation.
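
To make the proxy case concrete, a small sketch of such a resolver.
The function name, the table layout and the NBMA address strings are
all made up for illustration; this is not how NHRP encodes anything.

```python
import ipaddress


def resolve(dest, on_nbma, proxied):
    """Return the NBMA address to use for dest, or None if unresolvable.

    on_nbma: dict of IP address string -> NBMA address, for stations
             directly attached to the NBMA.
    proxied: list of (prefix string, router NBMA address) pairs, one per
             single homed router claiming the destinations behind it.
    """
    if dest in on_nbma:
        return on_nbma[dest]
    addr = ipaddress.ip_address(dest)
    for prefix, router_nbma in proxied:
        if addr in ipaddress.ip_network(prefix):
            # Proxy case: the router answers as though its own physical
            # interface were the destination.
            return router_nbma
    return None
```

The querier cannot tell the two answers apart, which is why this is
only safe when the router really is the only way to reach the
destinations it claims.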

> Thank you,
> Joel M. Halpern				jhalpern@newbridge.com
> Newbridge Networks Inc.

Curtis