Re: BGP-4 - revised I-D

Curtis Villamizar <curtis@ans.net> Wed, 21 August 1996 02:03 UTC

Message-Id: <199608210124.VAA19011@brookfield.ans.net>
To: bgp@ans.net
Cc: tli@jnx.com
Reply-To: curtis@ans.net
Subject: Re: BGP-4 - revised I-D
In-Reply-To: Your message of "Fri, 09 Aug 1996 13:29:07 PDT." <199608092032.NAA00931@hubbub.cisco.com>
Date: Tue, 20 Aug 1996 21:24:02 -0400
Sender: ietf-archive-request@ietf.org
From: Curtis Villamizar <curtis@ans.net>
X-Orig-Sender: owner-idr@merit.edu
Precedence: bulk

In message <199608092032.NAA00931@hubbub.cisco.com>, Yakov Rekhter writes:
> Folks,
> 
> Today we (Tony and myself) submitted a revised version
> of BGP-4 I-D. The only two essential changes are in
> Section 9.1.2.1 clause (a), and Section 9.2.1.1 clause (a).
> The changes reflect the agreement we reached at the
> last IDR WG meeting.
> 
> Yakov.


Yakov,

BGP4 is great stuff, but for it to advance we really should clean up
the document to make it harder to come up with implementations that
are compatible with the BGP4 RFC but don't interoperate well enough
with other implementations to ensure loop-free operation.

We gained a lot of experience with gated, Cisco, and Bay
implementations of BGP4.  Subtle differences in the route selection
rules of each can cause route loops, in some cases in single vendor
environments, but mostly when mixing implementations.

The changes in Section 9.1.2.1 clause (a), and Section 9.2.1.1 clause
(a) take a step toward addressing this but don't do a complete job.

Please consider the problems described below and the suggested
changes.

Regards,

Curtis



First, here are conditions that can cause routing loops.  These all
involve differences in route selection criteria, or route selection
criteria that can cause routing loops if routes are evaluated in
different orders on different routers in a network.

   a. We haven't seen anyone lose the significance of local-preference
      as the primary selection criterion.  I'd like to see the spec
      absolutely clear about this.

   b. We have seen routing loops form as a result of some routers
      using AS-path length in the decision criteria.  If this is not
      standard behavior, then the spec needs to say that.  If this is
      useful (it is) but not standard behavior, the spec needs to
      allow it as an option.

   c. Differences in interpretation of a missing MED and poor choice
      of behavior covering a missing MED.  Without going into details
      (see below) preferring a route with no MED over one with a
      MED can cause route loops.  Not considering MED at all in
      comparing routes with MED to routes without can cause route
      loops.  This is addressed by the changes you've made.

   d. Without going into details (see below) not considering MED
      when comparing routes from different AS can cause routing loops.
      This is not addressed by the changes you've made.  BGP4 needs to
      explicitly require considering MED when comparing routes from
      different AS.

Details of route loops in a) and b) should be obvious.

For b), consider:

	R1  --2--  R2  --1--  R3  --4-- R4

R1 announces a route with AS path length 2, R4 announces the same
route with AS path length 1.  R2 considers AS path length and picks
R4, R3 doesn't consider AS path length and picks R1.  R2 and R3 point
next hops at each other.
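
The b) case can be sketched in a few lines of Python.  This is only an
illustration, not any vendor's decision process: the table names and
pick_exit() are made up for the example, and it assumes that a router
which ignores AS path length falls back to lowest IGP cost.

```python
# Hypothetical model of the chain R1 --2-- R2 --1-- R3 --4-- R4.
# IGP cost from each interior router to each exit, summed along the chain.
IGP_COST = {
    "R2": {"R1": 2, "R4": 1 + 4},
    "R3": {"R1": 1 + 2, "R4": 4},
}
# AS path length of the route as announced by each exit router.
AS_PATH_LEN = {"R1": 2, "R4": 1}
# First hop on the chain from each interior router toward each exit.
NEXT_HOP = {
    "R2": {"R1": "R1", "R4": "R3"},
    "R3": {"R1": "R2", "R4": "R4"},
}


def pick_exit(router, use_as_path_len):
    """Return the exit this router selects for the route."""
    def key(exit_router):
        igp = IGP_COST[router][exit_router]
        if use_as_path_len:
            # Shorter AS path first, IGP cost only as a tie-break.
            return (AS_PATH_LEN[exit_router], igp)
        # Assumption: without AS path length, lowest IGP cost decides.
        return (igp,)
    return min(AS_PATH_LEN, key=key)


r2_exit = pick_exit("R2", use_as_path_len=True)
r3_exit = pick_exit("R3", use_as_path_len=False)
r2_next = NEXT_HOP["R2"][r2_exit]
r3_next = NEXT_HOP["R3"][r3_exit]
loop = r2_next == "R3" and r3_next == "R2"
print(r2_exit, r3_exit, loop)
```

R2 selects R4 and forwards to R3; R3 selects R1 and forwards to R2.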

Details of route loops in c) and d) are provided here.  

If there is ever a case where the comparison of three or more routes
is not a transitive operation, the order that routes appear on
interior routers will affect the outcome.  If routes arrive at
interior routers in different orders, which is common with IBGP or
with EBGP redistributed using IGP external routes, a routing loop can
form.  Examples are given below.  This has been observed in practice.

Refer to the diagram below where R{1-5} are routers and the numbers on
the lines are IGP link costs.

    R1  --6--  R4  --1--  R5  --4--  R3
    R2  --2-/

   c.	preferring a route with no MED

		MED	IGP		
	R1	 -	6-7		R2 is better than R1 (lower IGP)
	R2	 -	2-3		R3 is better than R2 (no MED)
	R3	 2	4-5		R1 is better than R3 (no MED)

	R4 learns R1, R2, R3.  It prefers R3.
	R5 learns R1, R3, R2.  It prefers R2.  (R2 announcement delayed)
	R4 and R5 point next hop at each other.

	not considering MED comparing routes with MED to routes without

		MED	IGP		
	R3	 -	4-5		R2 is better than R3 (lower IGP)
	R2	 1	2-3		R1 is better than R2 (lower MED)
	R1	 2	6-7		R3 is better than R1 (lower IGP)

	R4 learns R1, R2, R3.  It prefers R3.  (R3 announcement delayed)
	R5 learns R3, R1, R2.  It prefers R2.
	R4 and R5 point next hop at each other.

   d.	not considering MED when comparing routes from different AS

	(same as prior example)

	    AS	MED	IGP		
	R3   x	 1	4-5		R2 is better than R3 (lower IGP)
	R2   y	 2	2-3		R1 is better than R2 (lower MED)
	R1   y	 3	6-7		R3 is better than R1 (lower IGP)

	R4 learns R1, R2, R3.  It prefers R3.  (R3 announcement delayed)
	R5 learns R3, R1, R2.  It prefers R2.
	R4 and R5 point next hop at each other.
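
The order dependence in d) can be sketched as follows.  This is a
rough model, not any real implementation: each router keeps a current
best route and compares each arrival against it.  The IGP costs are
read off the diagram.  The MED values are hypothetical, chosen so that
R1 beats R2 on MED as the annotations for d) state.  All function
names are invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    exit_router: str
    neighbor_as: str
    med: int


# IGP cost from each interior router to each exit (from the diagram).
IGP = {
    "R4": {"R1": 6, "R2": 2, "R3": 5},
    "R5": {"R1": 7, "R2": 3, "R3": 4},
}

# Hypothetical MEDs: R1 < R2 so R1 wins the same-AS comparison.
ROUTES = {
    "R1": Route("R1", "y", 1),
    "R2": Route("R2", "y", 2),
    "R3": Route("R3", "x", 1),
}


def better_same_as_only(router, a, b):
    """MED is compared only within one neighbor AS, otherwise IGP cost."""
    if a.neighbor_as == b.neighbor_as and a.med != b.med:
        return a if a.med < b.med else b
    return a if IGP[router][a.exit_router] < IGP[router][b.exit_router] else b


def better_any_as(router, a, b):
    """MED is compared regardless of AS, with IGP cost as the tie-break."""
    def key(r):
        return (r.med, IGP[router][r.exit_router])
    return a if key(a) <= key(b) else b


def select(router, arrival_order, better):
    """Compare each arriving route against the current best, in order."""
    best = ROUTES[arrival_order[0]]
    for name in arrival_order[1:]:
        best = better(router, best, ROUTES[name])
    return best.exit_router


# Non-transitive rule: the result depends on arrival order.
r4 = select("R4", ["R1", "R2", "R3"], better_same_as_only)
r5 = select("R5", ["R3", "R1", "R2"], better_same_as_only)
print(r4, r5)

# MED compared across AS: same arrival orders, both routers agree.
r4_fixed = select("R4", ["R1", "R2", "R3"], better_any_as)
r5_fixed = select("R5", ["R3", "R1", "R2"], better_any_as)
print(r4_fixed, r5_fixed)
```

Under the first rule R4 ends up on R3 (reached through R5) and R5 ends
up on R2 (reached through R4).  Under the second rule the comparison
is a total order, so arrival order no longer matters.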

With that out of the way (hopefully), we can go about fixing BGP4 so
route loops can't form between any two implementations that adhere to
the BGP4 RFC.

In 9.1.1:

   ...

   For each newly received or replacement feasible route, the local BGP
   speaker shall determine a degree of preference. If the route is
   learned from a BGP speaker in the local autonomous system, either the
   value of the LOCAL_PREF attribute shall be taken as the degree of
   preference, or the local system shall compute the degree of
   preference of the route based on preconfigured policy information.

+  A BGP4 implementation may consider AS_PATH length after LOCAL_PREF.
+  Shorter AS_PATH length may be preferred over longer AS_PATH lengths
+  as an option.  All implementations MUST have the ability to disable
+  this option.  The length shall be considered to be the sum of the
+  number of AS in each AS_SEQUENCE plus the number of AS_SET in the
+  AS_PATH.  Note that each AS_SET contributes one to the computed
+  sum regardless of how many AS are in the AS_SET and each
+  AS_SEQUENCE contributes the number of AS in the AS_SEQUENCE to the
+  sum.
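
The length computation proposed above, as a minimal Python sketch.
The (segment type, AS list) representation and the AS numbers are
assumptions made for the example, not a wire format.

```python
AS_SEQUENCE = "AS_SEQUENCE"
AS_SET = "AS_SET"


def as_path_length(as_path):
    """Sum the AS count of each AS_SEQUENCE plus one per AS_SET."""
    length = 0
    for segment_type, as_numbers in as_path:
        if segment_type == AS_SEQUENCE:
            # Every AS in a sequence counts.
            length += len(as_numbers)
        elif segment_type == AS_SET:
            # A set counts once, however many AS it holds.
            length += 1
        else:
            raise ValueError(f"unknown segment type: {segment_type!r}")
    return length


# Illustrative path: a two-AS sequence followed by a three-AS set.
example_path = [(AS_SEQUENCE, [690, 1239]), (AS_SET, [3561, 701, 174])]
print(as_path_length(example_path))  # 2 + 1 = 3
```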

In 9.1.2 (back reference for improved clarity):

   For each set of destinations for which a feasible route exists in the
   Adj-RIBs-In, the local BGP speaker shall identify the route that has:

!  a) the highest degree of preference of any route using the criteria
!     described in 9.1.1, or

In 9.1.2.1:

      a) If the local system is configured to take into account
      MULTI_EXIT_DISC, and the candidate routes differ in their
      MULTI_EXIT_DISC attribute, select the route that has the lowest
      value of the MULTI_EXIT_DISC attribute.  A route with
      MULTI_EXIT_DISC shall be preferred to a route without
!     MULTI_EXIT_DISC.  MULTI_EXIT_DISC shall be compared regardless
+     of the AS that the candidate routes were heard from and
+     regardless of any difference in the AS_PATH.

	...

         - otherwise, select the route that was advertised by the BGP
         speaker whose BGP Identifier has the lowest value.

+  Any deviation from the selection criteria described here can be
+  implemented as long as the deviation is optional.  An example is
+  the AS_PATH length option described in 9.1.1.1.  There MUST be a
+  way to disable any supported deviation from the selection criteria.
+  Any deviation MUST be clearly indicated as such.  It SHOULD be as
+  easy as possible to select the standard BGP4 route selection.
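
One possible reading of clause (a) as amended in 9.1.2.1, sketched in
Python.  Candidate, med_key() and survivors() are hypothetical names
for the example.  The sketch covers only clause (a); the later
tie-breaking clauses are left out.

```python
from typing import NamedTuple, Optional


class Candidate(NamedTuple):
    name: str
    med: Optional[int]  # None models a route without MULTI_EXIT_DISC


def med_key(candidate):
    """Sort key: routes with a MED come first, lower MED ahead of higher."""
    if candidate.med is not None:
        return (0, candidate.med)
    return (1, 0)


def survivors(candidates):
    """Return the candidates still tied after clause (a)."""
    best = min(med_key(c) for c in candidates)
    return [c for c in candidates if med_key(c) == best]


routes = [
    Candidate("a", None),
    Candidate("b", 5),
    Candidate("c", 5),
    Candidate("d", 9),
]
print([c.name for c in survivors(routes)])
```

Because every route maps to a single key, whatever its AS or AS_PATH,
the comparison is transitive and the order of arrival cannot change
the result.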