Re: [Idr] draft on virtual aggregation

Robert Raszuk <> Fri, 11 July 2008 07:10 UTC

Date: Fri, 11 Jul 2008 00:01:15 -0700
From: Robert Raszuk <>
To: Paul Francis <>
List-Id: Inter-Domain Routing <>

Hi Paul,

 > In the meantime, however, FIB size alone is an immediate problem for a
 > lot of ISPs, because it is specifically this that forces them to
 > upgrade hardware when they otherwise don't need to do it.

As Curtis pointed out, deploying any form of tunneling in the core,
MPLS or IP, ultimately addresses FIB scaling in that part of the
network.

It is true that some networks do not have a core ... the network can be
meshed edges or, more specifically, meshed POPs.

A very simple observation is that you can use a tunnel from the edge to
one router (per POP, for example), do the IP lookup there, then
encapsulate to the exit point. In that scenario your edge routers are
freed from carrying the full table, because the place where the single
IP lookup is done and the switching decision is made has shifted.
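
To make the idea concrete, here is a minimal Python sketch of that
forwarding split. It is an illustration only, not any vendor's
implementation: the router names, the exit-point names and the prefixes
(taken from documentation ranges) are all made up for the example.

```python
import ipaddress

# Hypothetical full table, held only by the per-POP lookup router:
# prefix -> exit point. All names and prefixes are illustrative.
POP_FIB = {
    ipaddress.ip_network("192.0.2.0/24"): "exit-A",
    ipaddress.ip_network("198.51.100.0/24"): "exit-B",
    ipaddress.ip_network("198.51.100.128/25"): "exit-C",
}

# The edge router carries no Internet routes, only a tunnel to the
# lookup router of its POP (hypothetical name).
EDGE_DEFAULT_TUNNEL = "pop-lookup-router"


def edge_forward(dst):
    """Edge router: no lookup in a full table, just tunnel to the POP router."""
    return EDGE_DEFAULT_TUNNEL


def pop_lookup(dst):
    """POP router: the single longest-prefix-match lookup for the packet."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in POP_FIB if addr in net]
    if not matches:
        return None  # no route for this destination
    best = max(matches, key=lambda net: net.prefixlen)
    return POP_FIB[best]


def path(dst):
    """Tunnel endpoints a packet visits: POP lookup router, then exit point."""
    return [edge_forward(dst), pop_lookup(dst)]
```

In this sketch the edge state is one tunnel no matter how large POP_FIB
grows, which is the whole point of moving the lookup.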

It is clearly not true that vendors today have any trouble delivering
boxes which can hold today's Internet table and allow for at least
5-10 times its growth. I know at least two of them which have been
shipping such routers for a few years now.

And such an architecture does not require dividing the address space
into any chunks and can be deployed today on any existing hardware
without waiting for any new protocol extensions.
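
For contrast, the chunking I am referring to can be sketched roughly as
below. This is only my simplified reading of the virtual prefix idea, not
the draft's exact procedure: the virtual prefixes, the router names and
the routes are hypothetical, and the real mechanism has more moving
parts.

```python
import ipaddress

# Hypothetical virtual prefixes (the "chunks") and the router assigned
# to aggregate each one. Names and values are illustrative only.
VIRTUAL_PREFIXES = {
    ipaddress.ip_network("192.0.0.0/8"): "router-1",
    ipaddress.ip_network("198.0.0.0/8"): "router-2",
}

# Hypothetical routes from the full table (documentation ranges).
ROUTES = [
    ipaddress.ip_network(p)
    for p in ("192.0.2.0/24", "198.51.100.0/24", "198.51.100.128/25")
]


def fib_for(router):
    """FIB of `router`: every virtual prefix, plus the specific routes
    only for the chunks this router is assigned to."""
    fib = set(VIRTUAL_PREFIXES)
    for vp, owner in VIRTUAL_PREFIXES.items():
        if owner == router:
            fib.update(r for r in ROUTES if r.subnet_of(vp))
    return fib
```

Here a router with no assigned chunk installs only the virtual prefixes,
but somebody has to define and maintain the chunk-to-router mapping,
which is the part the tunnel-to-POP approach avoids.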


> It is true that there are a number of routing problems that ultimately need
> to be solved...RIB and FIB size, convergence time, security, multi-homing.
> RRG is working on a single grand solution to all of this, but I'm not holding
> my breath.
> In the meantime, however, FIB size alone is an immediate problem for a lot of
> ISPs, because it is specifically this that forces them to upgrade hardware
> when they otherwise don't need to do it.  This is evidenced by the various
> hacks that ISPs currently employ to extend the life of their routers.  One is
> the "disconnected backbone" arrangement that Dan Ginsburg mentioned in his
> email.  Another is to simply ignore /24's that you don't have enough space
> for (see for instance
> I'm even familiar with a Tier1 ISP (I'm not at liberty to say who) that is
> actually using a hack that messes up the AS-path, and could create loops if
> another ISP did the same thing).  
> So this is already a problem that needs fixing.  Worse, as IPv4 addresses run
> out, there is real concern that addresses will become less aggregatable, and
> I'd like to have something in place should that happen.  VA represents about
> the simplest architecturally sound solution I can imagine.
> I guess you are suggesting that, since RAM is "cheap and dense", FIB really
> isn't the problem?  I personally don't know router architecture very well,
> but if what you say is true, why don't router vendors simply build routers
> with more FIB?  I suppose you could argue that they try to, but that they
> under-estimate DFRT growth?  This doesn't sound likely.  Tony Li has argued
> (see RFC4984) that because router memories are built in volume, Moore's law
> doesn't really apply.  This all suggests to me that there is a real cost to
> huge FIBs. 
> As for operational complexity, I don't think that you or I know if it is
> "too" much or not.  Really it is a question of the value ISPs get out of this
> versus the difficulty of doing it.  I'm not going to pretend that running VA
> is trivial, but nor does it strike me as all that bad.  It strikes me for
> instance as simpler than route reflectors (though that ain't saying much!).
> And given that ISPs already deal with config complexity in the hacks I
> mention above, there is a good chance in my mind that VA will be an
> acceptable solution.
> Regarding some specific points in your email:
> Regarding core-router MPLS fix:  To be clear, VA isn't about reducing *core*
> router FIB size, it is about reducing FIB size in *any* routers, all of them
> if that's what you want to do.  Since most routers aren't core routers,
> solutions that only address core routers are only so useful.  I've been told
> by folks at ISPs that the greater concern is edge routers.
> Regarding on-chip FIBs:  I've been told by Tony Li that he believes he can
> fit 200K FIB entries on-chip.  VA can get you 5x reduction very easily, and
> with some deployment creativity (which would take time and experience to
> develop) I'll bet we could do much much better.  But that is
> somewhat beside the point...the goal here is not to get to single-chip
> routers (though that might be where this leads us anyway), but to allow ISPs
> to extend the lifetime of their routers, which is something that at least
> some of them clearly want to be able to do (see below).
> Thanks,
> PF
>> -----Original Message-----
>> From: []
>> Sent: Wednesday, July 09, 2008 11:00 AM
>> To: Paul Francis
>> Cc:
>> Subject: Re: [Idr] draft on virtual aggregation
>> In message
>> <>
>> Paul Francis writes:
>>> Gang,
>>> At the following URL is a draft on virtual aggregation that I'm posting to
>>> IETF (it'll show up in a day or two), and which I'll present at IDR in
>>> Dublin.
>> 00.txt
>>> Title and abstract are below.  I hope to create a work item on this in IDR.
>>> I would characterize this as falling under the general charter of scaling
>>> BGP.
>>> Any comments and discussion on this prior to Dublin is of course greatly
>>> appreciated.
>>> PF
>>> Title:  Intra-Domain Virtual Aggregation
>>>    Virtual Aggregation (VA) is a technique for shrinking the DFZ FIB
>>>    size in routers (both IPv4 and IPv6).  This allows ISPs to extend the
>>>    lifetime of existing routers, and allows router vendors to build FIBs
>>>    with much less concern about the growth of the DFZ routing table.  VA
>>>    does not shrink the size of the RIB.  VA may be deployed autonomously
>>>    by an ISP (cooperation between ISPs is not required).  While VA can
>>>    be deployed without changes to existing routers, doing so requires
>>>    significant new management tasks.  This document describes changes to
>>>    routers and BGP that greatly simplify the operation of VA.
>>> _______________________________________________
>>> Idr mailing list
>> Paul,
>> Is there a need?
>>   Are we still trying to do the equivalent of keeping an AGS+ with
>>   DFRT alive somewhere?  RAM is cheap and dense.  To get to on-chip
>>   RAM would require orders of magnitude reductions in DFRT size.
>>   Other techniques exist for dramatically reducing core router FIB
>>   size if that becomes a goal for a provider.
>>   For example, MPLS (or GRE) tunneling through a BGP free core reduces
>>   FIB size to about the size of the IGP (should easily fit in on-chip
>>   memory).  It requires no protocol change.  Only down side is no ICMP
>>   when tunnel faults in middle prior to the ingress knowing about it
>>   (usually the case anyway due to VPN and VRF) and no fallback to IP
>>   when ingress knows that the tunnel is down and hasn't yet rerouted.
>> Is the solution worse than the problem?
>>   This seems too operationally problematic.
>> Curtis