[Idr] Re: BGP issues

Robert Raszuk <raszuk@juniper.net> Wed, 09 January 2008 13:10 UTC

Message-ID: <4784C5EF.1070806@juniper.net>
Date: Wed, 09 Jan 2008 05:02:39 -0800
From: Robert Raszuk <raszuk@juniper.net>
To: Iljitsch van Beijnum <iljitsch@muada.com>
References: <200801061814.m06IE8920382@magenta.juniper.net> <22AB40E0-1660-4B9D-BA74-B1CB98EB0882@cisco.com> <47813E05.2050602@juniper.net> <A627A8DF-42D2-4701-A6D5-1C8102537A41@cisco.com> <4781FFEA.8050800@juniper.net> <4782170D.2040200@cisco.com> <478252F3.4000809@juniper.net> <4782D04B.50703@gmail.com> <47832402.9090001@cisco.com> <47833D03.7050005@juniper.net> <4783E670.5080003@gmail.com> <4783EA14.2030900@juniper.net> <4783F0B8.3070401@gmail.com> <4783F4A8.2000506@juniper.net> <792E56AD-25FB-4942-B649-FAF12DBEFD77@muada.com>
In-Reply-To: <792E56AD-25FB-4942-B649-FAF12DBEFD77@muada.com>
Cc: idr <idr@ietf.org>

Hi Iljitsch,

As we are discussing BGP, I am redirecting this to the IDR list for the
IDR WG to comment or work on.

>> But I would like to also see the real list of what are the issue in 
>> BGP we are trying to address ... Is this point-to-point model, is this 
>> TCP transport, is this update format ... No one yet stated what is 
>> broken. And when we do not know what is broken .. or which link of BGP 
>> chain is fragile it is quite a challenge to design new thing around 
>> the unknown.
> In addition to the fact that it's possible to configure iBGP such that 
> you can have routing loops?

Could you provide an example?

> it suffers from some of the problems inherent to distance vector
> routing, such as slow convergence

BGP can converge today in under a second. Moreover, with a correct
implementation BGP routes may converge in a prefix-independent manner.
It is not that BGP is slow. The vast majority of end-to-end convergence
time comes from failure localization, failure detection and failure
propagation. Even today BGP can be made to propagate more than one path
within a given AS when best-external functionality is in place. Ask your
favourite vendor for details. Moreover, there are also proposals being
discussed at IDR to send more than the best path in BGP.

> Another problem area for
> BGP is the fact that all processing happens on a per-prefix basis: there
> is no way to communicate reachability or policy changes except to update
> all impacted prefixes. 

Not true. There is a proposal for BGP aggregate withdraw. Prefixes can
themselves be aggregated by mask. There are ways to distribute policies
without attaching them to prefixes (example: rt-constrain).
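
As a small illustration of aggregation by mask, here is a minimal sketch
using Python's standard ipaddress module. The prefixes are made-up
documentation-range examples, and this only shows the address
arithmetic, not any BGP implementation or the aggregate withdraw
proposal itself.

```python
import ipaddress

# Hypothetical contiguous more-specifics (documentation address range).
specifics = [
    ipaddress.ip_network("198.51.100.0/26"),
    ipaddress.ip_network("198.51.100.64/26"),
    ipaddress.ip_network("198.51.100.128/25"),
]

# collapse_addresses merges adjacent or overlapping networks into the
# smallest set of covering prefixes.
aggregates = list(ipaddress.collapse_addresses(specifics))
print(aggregates)  # [IPv4Network('198.51.100.0/24')]
```

Three entries collapse into a single /24 because together they cover
the whole block with no gaps.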

> BGP is extremely agnostic as to the underlying
> path selection algorithm in order to accommodate as much policy control
> as possible.

Accommodating policies is not a BGP protocol requirement; it comes from
operators using the protocol to fit their operational needs.

> Unfortunately, this makes it very hard to predict BGP's
> behavior and the default behavior (especially with today's rather flat
> AS hierarchy) is more often than not suboptimal. BGP allows harmful
> policies that keep the protocol from converging to a stable state.

Again, it is not BGP that allows harmful policies to be configured.
Policies can indeed be harmful, as there is no mechanism to control them
in any way. If in an IGP you block LSA flooding on all links to the
other routers (your IGP peers), would you call it an OSPF fault that
your router is somehow not seen by the rest of them?

> Lack
> of workable aggregation mechanisms means that once an address block is
> deaggregated, it's almost impossible to get rid of the resulting long
> prefixes, leading to excessive growth of the internet's global routing
> table. 

Correct. I think we agree that scalable multi-homing should be solved.
Perhaps a way to do that is to use PI address spaces, perhaps shim6, or
perhaps just to stop sending everything in one flat BGP as is done
today. IMHO the protocol is ready for a 2-tier hierarchy with no
protocol changes.

> Coarseness of the only available end-to-end metric (the AS path)
> pushes operators to deaggregation for traffic engineering purposes. 

What other metrics would you recommend adding to BGP, and how would
that reduce deaggregation?
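
For context on why deaggregation works as a traffic engineering tool
at all: forwarding uses longest-prefix match, so a more-specific prefix
attracts traffic before any BGP attribute such as AS path length is
compared. A minimal sketch, assuming a hypothetical two-entry table
(the prefixes and the "via ISP-A" / "via ISP-B" labels are invented for
illustration):

```python
import ipaddress

# Hypothetical table: prefix -> where the route was learned.
rib = {
    ipaddress.ip_network("203.0.113.0/24"): "via ISP-A",
    # Deaggregated more-specific announced through the other upstream.
    ipaddress.ip_network("203.0.113.128/25"): "via ISP-B",
}

def lookup(dst):
    """Return the route for dst using longest-prefix match, or None."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in rib if addr in net]
    if not matches:
        return None
    # The longest matching prefix wins.
    return rib[max(matches, key=lambda net: net.prefixlen)]

print(lookup("203.0.113.10"))   # covered only by the /24
print(lookup("203.0.113.200"))  # the /25 wins over the /24
```

Any added metric would have to give operators comparable steering
power to remove the incentive to announce the /25.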

> The
> way BGP operates within a single AS requires an additional intra-domain
> routing protocol 

IMHO this is a feature, not a bug. In fact, for convergence reasons,
having a fast IGP is a very useful feature.

> and suboptimal engineering tradeoffs by requiring
> having a full mesh between all BGP routers within the AS or having route
> reflectors or a confederation. 

That problem has been solved today. Within any domain you can run the
encapsulation of your choice between ASBRs, and none of the routers in
the core needs to have any BGP routes.
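
A toy sketch of that idea, with no particular encapsulation assumed.
The table contents, router names ("asbr1", "asbr2") and interface names
are all hypothetical; the point is only that the core forwards on the
outer header and never consults a BGP prefix.

```python
import ipaddress

# Ingress ASBR: BGP prefix -> egress ASBR (the BGP next hop). Hypothetical.
bgp_routes = {ipaddress.ip_network("203.0.113.0/24"): "asbr2"}

# Core router: only IGP routes to ASBR loopbacks, no BGP prefixes.
core_igp = {"asbr1": "if0", "asbr2": "if1"}

def ingress_encapsulate(dst):
    """Look up dst in BGP and wrap it in an outer header to the egress ASBR."""
    addr = ipaddress.ip_address(dst)
    for prefix, egress in bgp_routes.items():
        if addr in prefix:
            return {"outer_dst": egress, "inner_dst": dst}
    return None  # no BGP route for this destination

def core_forward(packet):
    """Forward on the outer destination only; the inner one is never read."""
    return core_igp[packet["outer_dst"]]

pkt = ingress_encapsulate("203.0.113.7")
print(pkt, core_forward(pkt))
```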

> There is no validation of routing
> information beyond the next hop. 

Are you alluding to the S-BGP/so-BGP proposals? If not, what other
validation would be required?

> A BGP speaker only communicates its
> best path (if any) to a neighbor, with no way to tie additional
> information to the nonexistence of a path and no way to accomplish type
> of service routing or install backup paths. 

Discussed that one above.

> Paths must be explicitly
> revoked, which in practice requires a BGP speaker to keep track of which
> paths were communicated to which peer. 

Not today. Today you revoke the prefix, not the path. True, when the
ability to send more than the best path is added, paths would have to be
explicitly revoked. Any other proposal in RRG does the very same.
Readvertisement of an EID with a new list of locators is an explicit

> BGP requires fairly extensive
> configuration (setting up filters) before it's useful.

It could be just one line if you want to send/accept everything to/from
your peer.

