Re: [Idr] draft-uttaro-idr-bgp-persistence-00

Robert Raszuk <robert@raszuk.net> Wed, 02 November 2011 13:39 UTC

Hi Bruno,

 > Nope. In your above case, I would expect the failure to be concealed
 > within the SP AS:
 >
 > PE1 has X via PE1/S (stale) and X via PE2/A (active) - it understands
 > STALE so selects in his forwarding table path via PE2.
 >
 > --> The customer does not see the "STALE" information.

Nope - it sure does.

Since I anticipated you would say so, I specifically pointed out that
there is no symmetry between the PEs because of the specific RT
configuration. See the quote below:

"Prefix X is advertised by remote hub in the given VPN such that PE1 vrf
towards CE1 only has X via PE3 and PE2's vrf towards CE2 only has X via
PE4."

> You bring the issue of the incremental deployment of a feature which
> change the (BGP) routing decision. That's a valid point (not necessarily
> specific to the persistence draft).

I think any draft that modifies the routing decision should discuss such
issues.

> One answer is careful deployment of such features.

I am afraid this is very difficult to achieve across EBGP boundaries. If
you limited STALE to IBGP only, my objection would perhaps be a bit
lighter :)

> However, the persistence draft could probably be improved to handle
> incremental deployment. This could be addressed in the next revision.
> A first tentative could be:
> - over an iBGP session, the router detecting the BGP session failure,
> flag the routes with a low (e.g. 0) LOCAL_PREF plus the STALE community.
> Within the AS, the STALE community does not influence routing decision
> but inform possible downstream BGP routers of the cause of the low
> local_pref. In particular that the LOCAL_PREF should probably not be
> further changed (some AS handles load balancing by rewriting the Local
> Pref on the ingress PE)
> - inform downstream AS of the STALE condition. First eBGP router could
> translate this community into a low local_pref. With the iBGP mesh, cf
> above.

Translating STALE to Local Pref over EBGP, and not using STALE for the
local decision, does indeed make the proposal much less dangerous.
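
To make sure we read the tentative rules the same way, here is a rough
Python sketch of how I understand them. It is only an illustration, not
text from the draft: the Path type, the function names, the "STALE"
placeholder string and the default LOCAL_PREF of 100 are all my own
assumptions, and the best path step is reduced to a LOCAL_PREF comparison.

```python
from dataclasses import dataclass, replace

STALE = "STALE"        # placeholder label; no real community value is assumed
LOW_LOCAL_PREF = 0     # the "low (e.g. 0)" value from the quoted proposal


@dataclass(frozen=True)
class Path:
    prefix: str
    next_hop: str
    local_pref: int = 100              # assumed default, for illustration
    communities: frozenset = frozenset()


def mark_stale_on_ibgp_failure(path):
    """Router that detects the iBGP session failure: lower LOCAL_PREF
    and attach STALE so downstream routers know why it is low."""
    return replace(path,
                   local_pref=LOW_LOCAL_PREF,
                   communities=path.communities | {STALE})


def import_from_ebgp(path, default_local_pref=100):
    """First eBGP router in the downstream AS: translate STALE into a
    low LOCAL_PREF, otherwise apply the usual default."""
    if STALE in path.communities:
        return replace(path, local_pref=LOW_LOCAL_PREF)
    return replace(path, local_pref=default_local_pref)


def best_path(paths):
    """Simplified selection: STALE itself is ignored, only LOCAL_PREF
    is compared (highest wins)."""
    return max(paths, key=lambda p: p.local_pref)


stale = mark_stale_on_ibgp_failure(Path("X", "PE1"))
active = Path("X", "PE2")
print(best_path([stale, active]).next_hop)
```

The point of the sketch is that a router which knows nothing about STALE
still reaches the same answer, because the preference is carried entirely
by LOCAL_PREF.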

Is any action associated with STALE going to be disabled by default in
all BGP implementations? If yes, I would find it acceptable, provided
the above changes are introduced.

However, if any implementation enables automatic translation of STALE to
local pref = 0 across an EBGP boundary, or makes the best path decision
based on the presence of STALE, we have the same problem as soon as any
two routers are not upgraded at the same time to the same vendor's code
version.
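
A toy illustration of that concern, again in Python and again only a
sketch. The dict layout, the igp_cost tie-breaker and the choice to rank
STALE ahead of LOCAL_PREF on an upgraded router are assumptions made for
the example, not behaviour taken from the draft or from any
implementation.

```python
def best_path(paths, understands_stale):
    """Pick the best path under a heavily simplified decision process.

    An upgraded router (understands_stale=True) treats a STALE path as
    least preferred; a router that has not been upgraded ignores the
    marking and falls through to LOCAL_PREF, then IGP cost.
    """
    def rank(path):
        stale_penalty = 1 if (understands_stale and path["stale"]) else 0
        # Lower tuple wins: non-stale first, then higher LOCAL_PREF,
        # then lower IGP cost to the next hop.
        return (stale_penalty, -path["local_pref"], path["igp_cost"])

    return min(paths, key=rank)


# Same two paths for prefix X as seen by both routers.
paths = [
    {"next_hop": "PE1", "stale": True,  "local_pref": 100, "igp_cost": 5},
    {"next_hop": "PE2", "stale": False, "local_pref": 100, "igp_cost": 10},
]

upgraded = best_path(paths, understands_stale=True)
legacy = best_path(paths, understands_stale=False)
print(upgraded["next_hop"], legacy["next_hop"])
```

With identical inputs the two routers settle on different exits, which
is exactly the inconsistency I am worried about during incremental
deployment.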

--

Regardless of how novel and appealing the idea is of telling your
customer that you can possibly forward while your control plane is
undergoing recovery, I still think that the churn introduced by this sort
of signalling on a per-path basis is a pretty bad idea. If anything, I
would rather see this in the BGP Operational Message, not in the actual
routing UPDATE messages.

Cheers,
R.