Re: [Idr] draft on virtual aggregation

Paul Francis <francis@cs.cornell.edu> Fri, 01 August 2008 02:11 UTC

Date: Thu, 31 Jul 2008 22:11:24 -0400
Message-ID: <37BC8961A005144C8F5B8E4AD226DE1110DAA7@EXCHANGE2.cs.cornell.edu>
In-Reply-To: <15B86BC7352F864BB53A47B540C089B605E92736@xmb-rtp-20b.amer.cisco.com>
From: Paul Francis <francis@cs.cornell.edu>
To: "Rajiv Asati (rajiva)" <rajiva@cisco.com>, <curtis@occnc.com>
Cc: idr@ietf.org
Subject: Re: [Idr] draft on virtual aggregation

I've been thinking about this, so let me give a more thoughtful answer.

As I said, if you want to be able to do FIB reduction in edge routers whose
eBGP peers require the full DFZ, then either those edge routers must have the
full RIB, or some other router must do the peering (multi-hop).

So one limitation that you've bought yourself is that at least one router has
to have a full FIB.  This might be ok in many ISPs, but nevertheless there it
is.  With VA FIB reduction, all routers can have smaller FIBs.

When you shift peering to an internal router, that router will have more
peers, and therefore bigger RIB and processing requirements.

You have to configure this internal router to give the appropriate routes to
the edge routers.

If the internal router doing the peering crashes, then a bunch of issues come
up.  Of course, connectivity with the peers is still good, so you need
another router to do the peering.  Without some kind of hot standby scheme,
this would require restarting all the BGP peering sessions.  Of course, it
requires more configuration, and that the internal routers monitor each
other's aliveness so as to know when to start peering.  In short, you step
away from the traditional model whereby the data path and the control path
(routing messages) are closely coupled, and a whole chain of complexities
kicks in that we don't know well how to deal with.

Now, if you keep the full RIB, then you get to run routing protocols as we've
run them for 25 years...no change at all.  You just add a new function that
decides what to install and what to suppress from the FIB.  Much of the VA
spec basically describes what these rules are (and they are quite simple).
You are absolutely correct that this complicates debugging...as I said, the
admin now needs to know what was suppressed, and where packets should flow.
Plus there are new failure modes (aggregation point crashing) so you need to
provision carefully, and there is some new config.  But overall it is much
cleaner, IMHO, than stepping away from the traditional packets-follow-control
routing model we've always had.
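
To make the "new function" concrete, here is a minimal sketch of an
install-or-suppress decision.  It is not taken from the VA spec: the rule it
uses (suppress a route when it falls under a virtual prefix for which this
router is not an aggregation point) is an assumption for illustration, and
the names split_fib, virtual_prefixes and local_aggregation_points are
hypothetical.  The RIB itself is left untouched; only the FIB contents change.

```python
import ipaddress


def split_fib(rib_prefixes, virtual_prefixes, local_aggregation_points):
    """Split RIB prefixes into (installed, suppressed) lists for the FIB.

    Assumed rule (illustrative only, not the VA spec): a route is
    suppressed when it is a more-specific of some virtual prefix and this
    router is not an aggregation point for any virtual prefix covering it.
    Everything else, including the virtual prefixes themselves, is installed.
    """
    vps = [ipaddress.ip_network(p) for p in virtual_prefixes]
    local = {ipaddress.ip_network(p) for p in local_aggregation_points}

    installed, suppressed = [], []
    for text in rib_prefixes:
        net = ipaddress.ip_network(text)
        # Virtual prefixes that strictly cover this route (same address family).
        covering = [
            vp for vp in vps
            if vp.version == net.version and net != vp and net.subnet_of(vp)
        ]
        if covering and not any(vp in local for vp in covering):
            # Packets for this route follow the virtual prefix instead.
            suppressed.append(text)
        else:
            installed.append(text)
    return installed, suppressed


if __name__ == "__main__":
    rib = ["10.1.0.0/16", "10.2.3.0/24", "192.0.2.0/24",
           "10.0.0.0/8", "172.16.5.0/24"]
    installed, suppressed = split_fib(
        rib,
        virtual_prefixes=["10.0.0.0/8", "192.0.0.0/8"],
        local_aggregation_points=["192.0.0.0/8"],
    )
    print("installed: ", installed)
    print("suppressed:", suppressed)
```

Returning the suppressed list alongside the installed one is deliberate: it
is the piece of information the admin would need when debugging.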

I wasn't being entirely facetious when I suggested that, to see how complex
RIB reduction is, all you need to do is look at the RRG proposals.  It is a
fundamentally hard problem, and to date we don't have good answers.  FIB
reduction allows you to avoid that path.

PF


> -----Original Message-----
> From: Rajiv Asati (rajiva) [mailto:rajiva@cisco.com]
> Sent: Thursday, July 31, 2008 9:07 AM
> To: curtis@occnc.com; Paul Francis
> Cc: idr@ietf.org
> Subject: RE: [Idr] draft on virtual aggregation
> 
> 
> The idea of keeping the FIB out of sync with the RIB is a bit discomforting
> when troubleshooting has to be done for a traffic outage. Also, this
> unnecessarily increases the complexity of the RIB and FIB interaction.
> 
> For a lot of network operators, it may be cleaner to not even install
> the qualified BGP paths into the RIB. FIB suppression would happen
> automatically.
> 
> Note that the decision to advertise the BGP paths to the downstream BGP
> speaker can be independent of whether the paths are suppressed or not in
> the RIB/FIB. This may be proposed by this specification.
> 
> Perhaps, you could clarify the rationale/advantages for keeping the
> routes in the RIB. Thanks.
> 
> Cheers,
> Rajiv
> 
> 
> > -----Original Message-----
> > From: idr-bounces@ietf.org [mailto:idr-bounces@ietf.org] On
> > Behalf Of Curtis Villamizar
> > Sent: Tuesday, July 15, 2008 10:57 AM
> > To: Paul Francis
> > Cc: idr@ietf.org
> > Subject: Re: [Idr] draft on virtual aggregation
> >
> >
> > In message
> > <37BC8961A005144C8F5B8E4AD226DE1109D860@EXCHANGE2.cs.cornell.edu>
> > Paul Francis writes:
> > >
> > > To be clear, we are talking about one new attribute, zero changes to
> > > the data plane, zero changes to the existing BGP decision
> > > process....just some rules for automatically setting up tunnels and
> > > new address aggregates (virtual prefixes).  Better to do this now well
> > > before the next generation of routers runs out of FIB.
> > >
> > > PF
> >
> >
> > Paul,
> >
> > Most providers amortize their routers in three years but keep them in
> > service for five or more.  Typical growth rates in healthy providers
> > are a doubling in about 1.5-2 years with some providers reporting 1
> > year (unconfirmed).  They keep routers in service by moving them
> > closer to the edge where the lower capacity of the router is less of
> > an issue, sometimes redeploying from major cities to lesser cities.
> >
> > A smart provider looks at their current default free routing size and
> > looks for at least 2 and better 4 or more times that in FIB and RIB
> > capacity, with RP memory size also dictated by the number of BGP peers
> > and peer groups that are expected to be supported.
> >
> > Most of the providers with very large FIB and RIB are those top tiers
> > that do not do a good job of aggregating the routes for their own
> > infrastructure.  To aggregate well, they have to first allocate blocks
> > of addresses by POP and also subdivide their network into areas and
> > aggregate at area boundaries (possibly the only functionality where
> > confederations may be more straightforward and less error prone than
> > RR, but that is another topic).
> >
> > If my memory serves me correctly, the target for major router vendors
> > (dictated by certain tier-1 providers) was over 1 million circa late
> > 1990s, about 2 million early 2000 and some asked for as much as 4
> > million just to have headroom (and got it from some vendors).
> >
> > RAM is cheap.  Once you go off chip (RAM off the forwarding ASIC)
> > memory bandwidth is much more an issue than memory size.
> >
> > The problem is mainly "enterprise switch/routers" with on chip CAM and
> > TCAM and no provision for off chip RAM.  To a lesser extent, a few
> > routers intended as large enterprise routers or default free provider
> > routers will now require that you replace the forwarding cards.
> >
> > IMHO again: I think this is not a hack that IDR should pursue.  But I
> > have mostly worked with tier-1 providers and I am open to other
> > opinions.  Let's hear from some providers on this.
> >
> > Curtis
_______________________________________________
Idr mailing list
Idr@ietf.org
https://www.ietf.org/mailman/listinfo/idr