bgp bestpath conquer
mate csaba <matecs@niif.hu> Mon, 19 October 2015 20:50 UTC
From: mate csaba <matecs@niif.hu>
Subject: bgp bestpath conquer
To: routing-discussion@ietf.org
Message-ID: <5625577F.9070106@niif.hu>
Date: Mon, 19 Oct 2015 22:50:07 +0200
Archived-At: <http://mailarchive.ietf.org/arch/msg/routing-discussion/HahM-AyIW-oED3Mfe_Th_qAMeIw>
Cc: nep@listserv.niif.hu
Hi,

We run a network with three route reflectors. The third one carries only our own and customer prefixes, about 2k routes; the other two RRs hold the full table. The PEs advertise best-external, so the RRs have full visibility. Our policy dictates that every route has its locpref set to express active/backup.

We have observed multiple times that the small RR sends out its updates much faster than the full-table RRs. When a customer changes from active to backup, the active PE sends the withdraw to the RRs, and the RRs select the new bestpath and flood it. But in this case the locpref changes from high to low, so any PE that wants to forward traffic for this prefix has to receive the update from all the RRs: as long as one path with the high locpref still exists, that path will be selected locally.

The idea: what if I change the small RR to do the following? When it detects a nexthop change during bestpath calculation, it takes the old bestpath's locpref, increments it by 1, and sends out the new bestpath with this incremented locpref. If another nexthop change is detected for this prefix, it increments the locpref again. So every time it computes the bestpath for this prefix, it sends out the result with an increasing locpref, forcing the PEs to start using the new path immediately.

Optionally, a one-minute timer could expire the prefix, after which the RR sends out the prefix with the received locpref, because by that time the other RRs have completed their normal flooding process. That would double the BGP traffic, though.

This would speed up convergence in the active->backup case, because we don't have to wait for all the RRs to finish their work.

The only drawbacks I see now are the following:

- It requires local hot-potato routing at the backup PE to work.
- At most one conqueror RR may be used within a single cluster.
- When the active route flaps, the locpref will count up to 2^31.
- It disrupts IGP metric usage in the case of scattered RRs or add-path.

The question is: do you see anything else? Any feedback is welcome!

Thanks,
Csaba Mate
NIIF/AS1955
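
To make the proposal concrete, here is a minimal Python sketch of the "conqueror" behaviour described above. It is a toy model, not router code: the names `Path`, `ConquerorRR`, `advertise`, `expire` and `pe_best` are hypothetical, the prefix and locpref values are made up, and the PE-side selection compares locpref only. One detail the text does not specify is assumed here: while a bump is active, a recomputation that does not change the nexthop keeps the previously advertised locpref.

```python
from dataclasses import dataclass, replace
from typing import Dict, List, Set


@dataclass(frozen=True)
class Path:
    prefix: str
    next_hop: str
    local_pref: int


class ConquerorRR:
    """Toy model of the small RR's advertisement logic (hypothetical)."""

    def __init__(self) -> None:
        self._sent: Dict[str, Path] = {}   # last path advertised per prefix
        self._bumped: Set[str] = set()     # prefixes with an inflated locpref

    def advertise(self, best: Path) -> Path:
        """Return the path to flood after a bestpath run selected `best`."""
        prev = self._sent.get(best.prefix)
        if prev is not None and prev.next_hop != best.next_hop:
            # Nexthop changed: outbid whatever was advertised before.
            lp = prev.local_pref + 1
            self._bumped.add(best.prefix)
        elif best.prefix in self._bumped:
            # Assumption: same nexthop, bump still active, keep it.
            lp = prev.local_pref
        else:
            lp = best.local_pref
        out = replace(best, local_pref=lp)
        self._sent[best.prefix] = out
        return out

    def expire(self, best: Path) -> Path:
        """Optional timer: fall back to the locpref as received."""
        self._bumped.discard(best.prefix)
        self._sent[best.prefix] = best
        return best


def pe_best(paths: List[Path]) -> Path:
    """PE-side selection reduced to the locpref comparison only."""
    return max(paths, key=lambda p: p.local_pref)


if __name__ == "__main__":
    prefix = "192.0.2.0/24"
    rr = ConquerorRR()
    rr.advertise(Path(prefix, "PE1", 200))             # steady state, active PE
    fast = rr.advertise(Path(prefix, "PE2", 100))      # customer moved to backup
    stale = Path(prefix, "PE1", 200)                   # still held from slow RRs
    print(fast.local_pref)                             # bumped above the stale paths
    print(pe_best([stale, stale, fast]).next_hop)      # PE picks the new nexthop
```

In this model, a PE that still holds the stale high-locpref paths from the two full-table RRs switches to the backup nexthop as soon as the small RR's update arrives. Each further nexthop change raises the advertised locpref by one more, which is the unbounded counting listed among the drawbacks.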