Re: [Idr] draft-uttaro-idr-bgp-persistence-00

<bruno.decraene@orange.com> Wed, 02 November 2011 16:53 UTC

Date: Wed, 02 Nov 2011 17:51:08 +0100
From: bruno.decraene@orange.com
To: robert@raszuk.net
Cc: idr@ietf.org
Subject: Re: [Idr] draft-uttaro-idr-bgp-persistence-00

Robert,


>Bruno,
>
> > Indeed, possibly the persistence requirements could be inserted in the
> > GR draft.
>
>Really ? I find this totally opposite.
>
>GR can be very useful without persistence draft.

That was mostly my point. Sorry if I didn't make it clear.

> But how can the persistence
>draft be of any use without assuring proper forwarding, even at the
>given router ?

For an eBGP session, I agree that forwarding should be more or less trusted to be working (even if this is not strictly required, given that all routers will select the other path; cf. my previous email).

>So correct me if I am mistaken. Imagine the following sequence of events:
>
>- PE operates fine; GR is not enabled and persistence is
>- PE loses its sessions to the RRs as the RRs go down
>- PE keeps all routes in control plane and data plane (no GR involved)
>- PE re-advertises all routes (for given AFI/SAFI) to all 1000 of CEs
>   marked as STALE
>- PE re-establishes IBGP sessions with RRs and effectively performs GR
>   invalid path removal
>
>
>Is this correct ?

No.
Interaction between GR and persistence is currently not described in this version. It will be in the next version.
In short, if both GR and persistence were enabled on a BGP session, GR would run first. Then, if GR does not succeed, persistence would come into play.

>If so then end result is double churn to 1000 CEs and reuse of the main
>GR re-sync specification without enabling GR.

No.
If GR succeeds, there would be no churn.
Otherwise, there would be a single churn.
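
To make the intended ordering concrete, here is a minimal sketch of the decision described above: GR runs first, and persistence only comes into play if GR does not succeed. The names (Outcome, on_session_loss) are illustrative and do not come from the draft, and the last branch simply assumes ordinary BGP session-failure handling when neither mechanism applies.

```python
from enum import Enum


class Outcome(Enum):
    NO_CHURN = "GR succeeded: routes kept, nothing re-advertised"
    SINGLE_CHURN = "persistence: routes re-advertised once, marked STALE"
    WITHDRAWN = "neither applies: routes from the peer are removed"


def on_session_loss(gr_enabled: bool, gr_succeeded: bool,
                    persistence_enabled: bool) -> Outcome:
    """Hypothetical helper: what happens to a peer's routes after a session loss."""
    # GR is tried first; if it succeeds the event is never advertised.
    if gr_enabled and gr_succeeded:
        return Outcome.NO_CHURN
    # GR absent or failed: persistence keeps the routes but advertises them as STALE.
    if persistence_enabled:
        return Outcome.SINGLE_CHURN
    # Assumption: plain BGP behavior when neither mechanism is in effect.
    return Outcome.WITHDRAWN
```

In this sketch, a session with both features enabled yields either no churn or a single churn, never a double one.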

>Moreover with this you are relaxing the need for any capability
>negotiation. Perhaps GR authors could clarify why did we need in the
>first place per SAFI GR capability negotiation. Going by your draft just
>global GR / EOR marker would be sufficient. 

Indeed, no capability negotiation is required. We only need EOR. In v00, it has been assumed that EOR is now a standard behavior. Otherwise we could explicitly signal this using the GR capability.

>I don't recall any
>discussion on that point in persistent draft too.

" A speaker
   configures the ability to persist independently of it's peer.  There
   is no negotiation between the peers.  A timer must be configured
   indicating the time to persist stale state from a peer where the
   session is no longer viable. "

§3 http://tools.ietf.org/html/draft-uttaro-idr-bgp-persistence-00#page-6


"   o  The Receiving Speaker MUST replace the stale routes by the routing
      updates received from the peer.  Once the End-of-RIB marker for an
      address family is received from the peer, it MUST immediately
      remove any paths from the peer that are still marked as stale for
      that address family."

§4.2 http://tools.ietf.org/html/draft-uttaro-idr-bgp-persistence-00#section-4.2
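
As a rough illustration of the two quoted rules (no negotiation, stale paths replaced by fresh updates and purged at End-of-RIB), the behavior could be sketched as below. This is only a sketch under simplifying assumptions: a single address family, no persist timer, and a hypothetical PeerRib class that is not part of the draft.

```python
class PeerRib:
    """Hypothetical per-peer route table for one address family."""

    def __init__(self):
        # prefix -> {"attrs": ..., "stale": bool}
        self.paths = {}

    def update(self, prefix, attrs):
        # A fresh update always replaces whatever was there, stale or not.
        self.paths[prefix] = {"attrs": attrs, "stale": False}

    def session_lost(self):
        # Local decision only: keep every path but mark it stale.
        for path in self.paths.values():
            path["stale"] = True

    def end_of_rib(self):
        # On End-of-RIB, drop any path the peer did not re-advertise.
        self.paths = {p: v for p, v in self.paths.items() if not v["stale"]}


rib = PeerRib()
rib.update("10.0.0.0/24", "nh=PE3")
rib.update("10.0.1.0/24", "nh=PE3")
rib.session_lost()                    # both paths are now stale
rib.update("10.0.0.0/24", "nh=PE3")   # peer comes back, re-sends one prefix
rib.end_of_rib()                      # the other prefix is removed
```

After the last call only 10.0.0.0/24 remains, and it is no longer marked stale.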

Cheers,
Bruno

>Cheers,
>R.
>
>> Hi Enke,
>>
>>> From: Enke Chen; Sent: Thursday, October 27, 2011 2:43 AM
>>>
>>> Hi, folks:
>>>
>>> I have a hard time in understanding what new problems (beyond the GR)
>>> the draft try to solve :-(
>>
>> That's a good comment. Do you think the draft should put both GR and
>> persistence into perspective? (e.g. either in the introduction or in
>> appendix)
>>
>>> If the concern is about the simultaneous RR failure as shown in the
>>> examples in Sect. 6 Application, that can be addressed easily using GR.
>>> As the RRs are not in the forwarding path, it means that the forwarding
>>> is not impacted (thus is preserved) during the restart of a RR.   The
>>> Forwarding State bit (F) in the GR capability should always be set by
>>> the RR when it is not in the forwarding path.
>>
>> GR has limitations: limited duration, does not handle consecutive
>> session restarts, (currently) does not handle BGP failures, does not
>> advertise that the STALE path may be less trusted.
>>
>>> Also in the case of simultaneous RR failure, I do not see why one would
>>> want to retain some routes, but not others, using the communities
>>> specified in the draft.  As the RRs are not in the forwarding path,
>>> wouldn't be better to retain all the routes on a PE/client?
>>
>> I find it interesting that you considered that all routes would be kept
>> STALE for a long time. I was rather anticipating that some people would
>> be uncomfortable with this, especially in the VPN cases, as correct and
>> up-to-date routing information is required to ensure isolation between
>> VPNs (cf. the security section of the persistence draft).
>> Regarding your point, some routes could be fairly static and hence kept
>> STALE for a long time. Other routes may be more dynamic and more
>> problematic to keep STALE for a long time. Hence the ability to apply a
>> different response. Also, some routes may be more critical than others.
>>
>>> As you might be aware, efforts have been underway to address issues with
>>> GR found during implementation and deployment. They include the spec
>>> respin, notification handling, and implementations.  If there are issues
>>> in the GR area that are not adequately addressed,  I suggest that we try
>>> to address them in the GR respin if possible, instead of creating
>>> another variation unnecessarily.
>>
>> Indeed, possibly the persistence requirements could be inserted in the
>> GR draft. But how far are you / the IDR WG ready to change the GR
>> procedures? That would basically force all GR implementations to
>> implement the persistence behavior.
>>
>> Besides, IMHO, GR and persistence make different assumptions and hence
>> are rather different:
>> - GR assumes that the forwarding is preserved during the BGP session
>> restart and hence decides to not advertise this event in the network.
>> For such assumption, you need to believe pretty strongly that the
>> forwarding is indeed preserved. IMHO this calls for a short timer
>> duration and quick fall back to session failure if you suspect this may
>> not be the case (e.g. in case of consecutive restart of the BGP session,
>> GR is aborted).
>> - "persistence" considers that in case of issues, some stale information
>> is better than no information, but worse than correct information. The
>> downside is BGP churn to advertise the event. The good side is that this
>> is a safer assumption hence more likely to be valid over time or events.
>>
>> Regards,
>> Bruno
>>
>>>
>>> Thanks.   -- Enke
>>>
>>>
>>> On 10/26/11 10:24 AM, Robert Raszuk wrote:
>>>> Jim,
>>>>
>>>> When one during design phase of a routing protocol or routing protocol
>>>> extension or modification to it already realizes that enabling such
>>>> feature may cause real network issue if not done carefully - that
>>>> should trigger the alarm to rethink the solution and explore
>>>> alternative approaches to the problem space.
>>>>
>>>> We as operators have already hard time to relate enabling a feature
>>>> within our intradomain boundaries to make sure such rollout is network
>>>> wide. Here you are asking for the same level of awareness across ebgp
>>>> boundaries. This is practically unrealistic IMHO.
>>>>
>>>> Back to the proposal ... I think that if anything needs to be done is
>>>> to employ per prefix GR with longer and locally configurable timer.
>>>> That would address information persistence across direct IBGP sessions.
>>>>
>>>> On the RRs use case of this draft we may perhaps agree to disagree,
>>>> but I do not see large enough probability of correctly engineered RR
>>>> plane to experience simultaneous multiple ibgp session drops. If that
>>>> happens the RR placement, platforms or deployment model should be
>>>> re-engineered.
>>>>
>>>> Summary .. I do not think that IDR WG should adopt this document. Just
>>>> adding a warning to the deployment section is not sufficient.
>>>>
>>>> Best regards,
>>>> R.
>>>>
>>>>
>>>>> Robert,
>>>>>
>>>>> The introduction of this technology needs to be carefully evaluated
>>>>> when being deployed into the network. Your example clearly calls out
>>>>> how a series of independent design can culminate in incorrect
>>>>> behavior. Certainly the deployment of persistence on a router that
>>>>> has interaction with a router that does not needs to be clearly
>>>>> understood by the network designer. The goal of this draft is to
>>>>> provide a fairly sophisticated tool that will protect the majority of
>>>>> customers in the event of a catastrophic failure. The premise being
>>>>> the perfect is not the enemy of the good. I will add text in the
>>>>> deployment considerations section to better articulate that.
>>>>>
>>>>> Thanks, Jim Uttaro
>>>>>
>>>>> -----Original Message-----
>>>>> From: idr-bounces@ietf.org [mailto:idr-bounces@ietf.org] On Behalf Of Robert Raszuk
>>>>> Sent: Sunday, October 23, 2011 5:32 PM
>>>>> To: idr@ietf.org List
>>>>> Subject: [Idr] draft-uttaro-idr-bgp-persistence-00
>>>>>
>>>>> Authors,
>>>>>
>>>>> Actually when discussing this draft a new concern surfaced which I
>>>>> would like to get your answer on.
>>>>>
>>>>> The draft in section 4.2 says as one of the forwarding rules:
>>>>>
>>>>> o  Forwarding to a "stale" route is only used if there are no other
>>>>> paths available to that route.  In other words an active path always
>>>>> wins regardless of path selection.  "Stale" state is always
>>>>> considered to be less preferred when compared with an active path.
>>>>>
>>>>> In the light of the above rule let's consider a very simple case of
>>>>> dual PE attached site of L3VPN service. Two CEs would inject into
>>>>> their IBGP mesh routes to the remote destination: one marked as STALE
>>>>> and one not marked at all. (Each CE is connected to different PE and
>>>>> each PE RT imports only a single route to a remote hub headquarter to
>>>>> support geographic load balancing).
>>>>>
>>>>> Let me illustrate:
>>>>>
>>>>> VPN Customer HUB
>>>>>
>>>>>   PE3      PE4
>>>>>        SP
>>>>>   PE1      PE2
>>>>>    |        |
>>>>>    |        |
>>>>>   CE1      CE2
>>>>>    |        |
>>>>>   1|        |10
>>>>>    |        |
>>>>>   R1 ------ R2
>>>>>        1
>>>>>
>>>>> CE1,CE2,R1,R2 are in IBGP mesh. IGP metric of CE1-R1 and R1-R2 are 1
>>>>> and R2-CE2 is 10.
>>>>>
>>>>> Prefix X is advertised by remote hub in the given VPN such that PE1
>>>>> vrf towards CE1 only has X via PE3 and PE2's vrf towards CE2 only has
>>>>> X via PE4.
>>>>>
>>>>> Let's assume EBGP sessions PE3 to HUB went down, but ethernet link
>>>>> is up, next hop is in the RIB while data plane is gone. Assume no
>>>>> data plane real validation too. /* That is why in my former message
>>>>> I suggested that data plane validation would be necessary */.
>>>>>
>>>>> R1 has X via PE1/S (stale) and X via PE2/A (active) - it understands
>>>>> STALE so selects in his forwarding table path via CE2.
>>>>>
>>>>> R2 has X via PE1/S (stale) and X via PE2/A (active) - it does not
>>>>> understand STALE, never was upgraded to support the forwarding rule
>>>>> stated above in the draft and chooses X via CE1 (NH metric 2 vs 10).
>>>>>
>>>>> R1--R2 produce data plane loop as long as STALE paths are present in
>>>>> the system. Quite fun to troubleshoot too as the issue of PE3
>>>>> injecting such STALE paths may be on the opposite site of the world.
>>>>>
>>>>> The issue occurs when some routers within the customer site will be
>>>>> able to recognize STALE transitive community and prefer non stale
>>>>> paths in their forwarding planes (or BGP planes for that matter)
>>>>> while others will not as well as when both stale and non stale paths
>>>>> will be present.
>>>>>
>>>>> Question 1: How do you prevent forwarding loop in such case ?
>>>>>
>>>>> Question 2: How do you prevent forwarding loop in the case when
>>>>> customer would have backup connectivity to his sites or connectivity
>>>>> via different VPN provider yet routers in his site only partially
>>>>> understand the STALE community and only partially follow the
>>>>> forwarding rules ?
>>>>>
>>>>> In general as the rule is about mandating some particular order of
>>>>> path forwarding selection what is the mechanism in distributed
>>>>> systems like today's routing to be able to achieve any assurance that
>>>>> such rule is active and enforced across _all_ routers behind EBGP
>>>>> PE-CE L3VPN boundaries in customer sites ?
>>>>>
>>>>> Best regards, R.
>>>>>
>>>>>
>>>>> -------- Original Message --------
>>>>> Subject: [Idr] draft-uttaro-idr-bgp-persistence-00
>>>>> Date: Sat, 22 Oct 2011 00:23:55 +0200
>>>>> From: Robert Raszuk <robert@raszuk.net>
>>>>> Reply-To: robert@raszuk.net
>>>>> To: idr@ietf.org List <idr@ietf.org>
>>>>>
>>>>> Hi,
>>>>>
>>>>> I have read the draft and have one question and one observation.
>>>>>
>>>>> Question:
>>>>>
>>>>> What is the point of defining the DO_NOT_PERSIST community ? In other
>>>>> words, why would not having the PERSIST community set not mean the
>>>>> same as having the path marked with DO_NOT_PERSIST ?
>>>>>
>>>>> Observation:
>>>>>
>>>>> I found the below statement in section 4.2:
>>>>>
>>>>> o  Forwarding must ensure that the Next Hop to a "stale" route is
>>>>> viable.
>>>>>
>>>>> Of course I agree. But since we are stating the obvious in the
>>>>> forwarding section, I think it would be good to also state explicitly
>>>>> in the best path selection that the next hop to a STALE best path must
>>>>> be valid.
>>>>>
>>>>> However sessions, especially those between loopbacks, do not go down
>>>>> for no reason. Most likely there is a network problem which may have
>>>>> caused those sessions to go down. It is therefore likely that an LDP
>>>>> session also went down between some of the LSRs in the data path and
>>>>> that, in spite of having the paths in BGP and next hops in IGP, the LSP
>>>>> required for both quoted L2/L3VPN applications is broken. That may
>>>>> particularly happen when the network chooses to use independent control
>>>>> mode for label allocation.
>>>>>
>>>>> I would suggest to at least add the recommendation statement to the
>>>>> document that during best path selection especially for stale paths
>>>>> a validity of required forwarding paradigm to next hop of stale
>>>>> paths should be verified.
>>>>>
>>>>> For example using techniques as described in:
>>>>> draft-ietf-idr-bgp-bestpath-selection-criteria
>>>>>
>>>>> Best regards, R.
>>>>>
>
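
For reference, the R1/R2 loop scenario quoted above can be reproduced with a small sketch. It assumes the topology and IGP metrics given in that message (CE1-R1 = 1, R1-R2 = 1, R2-CE2 = 10), treats the path via CE1 as STALE and the path via CE2 as active, and models path selection as "closest exit wins" with an optional "prefer non-stale" step. All names here (LINKS, PATHS, choose_exit) are illustrative, not taken from the draft.

```python
import heapq

# IGP topology from the quoted example: (node, node) -> metric
LINKS = {("CE1", "R1"): 1, ("R1", "R2"): 1, ("R2", "CE2"): 10}
# BGP exits for prefix X: exit router -> is the path marked STALE?
PATHS = {"CE1": True, "CE2": False}


def neighbors(node):
    for (a, b), cost in LINKS.items():
        if a == node:
            yield b, cost
        elif b == node:
            yield a, cost


def shortest(src):
    """Dijkstra: return {dst: (cost, first_hop)} as seen from src."""
    best = {src: (0, None)}
    heap = [(0, src, None)]
    while heap:
        cost, node, first = heapq.heappop(heap)
        if cost > best[node][0]:
            continue
        for nbr, weight in neighbors(node):
            new_cost = cost + weight
            hop = nbr if first is None else first
            if nbr not in best or new_cost < best[nbr][0]:
                best[nbr] = (new_cost, hop)
                heapq.heappush(heap, (new_cost, nbr, hop))
    return best


def choose_exit(router, honors_stale):
    """Return (chosen exit, IGP next hop) for prefix X at the given router."""
    dist = shortest(router)
    candidates = list(PATHS)
    if honors_stale:
        # Rule from section 4.2: an active path always wins over a stale one.
        active = [e for e in candidates if not PATHS[e]]
        if active:
            candidates = active
    exit_router = min(candidates, key=lambda e: dist[e][0])
    return exit_router, dist[exit_router][1]


r1 = choose_exit("R1", honors_stale=True)    # upgraded router
r2 = choose_exit("R2", honors_stale=False)   # never upgraded
loop = r1[1] == "R2" and r2[1] == "R1"
```

Under these assumptions R1 forwards toward CE2 through R2 while R2 forwards toward CE1 through R1, which is the data plane loop described in the quoted message.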