Re: IDR WG agenda

Curtis Villamizar <curtis@ans.net> Fri, 28 February 1997 20:17 UTC

Message-Id: <199702281926.OAA07381@brookfield.ans.net>
To: Vince Fuller <vaf@wr.bbnplanet.com>
Cc: curtis@ans.net, Yakov Rekhter <yakov@cisco.com>, bgp@ans.net
Reply-To: curtis@ans.net
Subject: Re: IDR WG agenda
In-Reply-To: Your message of "Thu, 27 Feb 1997 16:03:38 PST." <CMM.0.90.2.857088218.vaf@hq.barrnet.net>
Date: Fri, 28 Feb 1997 14:26:33 -0500
From: Curtis Villamizar <curtis@ans.net>
Sender: owner-idr@merit.edu
Precedence: bulk

In message <CMM.0.90.2.857088218.vaf@hq.barrnet.net>, Vince Fuller writes:
> 
>     I meant a yet to be defined BGP community.  Making it a well known
>     community would help.
> 
> This would work, but there are still some situations where a new attribute
> would be preferred. The specific case that occurs to me is that of a provider
> who has multiple routers which are colocated with an exchange point. Right
> now, those routers could believe MED and then overwrite it on iBGP output but
> they still have to trust that the received MEDs aren't unreasonably high so
> as to cause them to prefer an iBGP route and break shortest-exit. There is
> an easy workaround if the routers at the IXP are dedicated -- always prefer
> external information over internal information -- and a less obvious and less
> reliable general workaround -- overwrite the MED with a value larger than
> anything that a peer will ever send (but how to determine that value). But
> from an architectural standpoint, having a separate "local-metric" attribute,
> used as a tie-breaker if equal MEDs are encountered and only sent or received
> on eBGP sessions, would be cleaner. 


Vince,

First of all, within an IGP, all peering to a given AS should be
configured consistently wrt incoming MED and IBGP.  If at one peering
MED is accepted and injected into IBGP, this should happen at all
peerings.  Either that or drop MED before injecting into IBGP.

In the router, MED can be considered to choose the best next hop from
any peering with an external AS and then get dropped before injecting
into IBGP and before comparing against any IBGP route.  The second
condition, drop MED from consideration before comparing against any
IBGP route is there so the border router makes the same decision as
the router one hop in from the border (so you don't get a routing
loop).  If that isn't being done, you have to ask your vendor to fix
their implementation.  No fix to BGP is needed.
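
A rough sketch of that two-step behavior, for illustration only.  It
assumes all candidate routes are for the same prefix, come from the
same neighboring AS, and already tie on LOCAL_PREF and AS path.  The
names (Route, med_value, med_for_ibgp, select), the treatment of a
missing MED as 0, and the "external wins a tie" rule are assumptions
made for the example, not text from the BGP4 draft.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Route:
    name: str
    external: bool        # True if learned directly from an EBGP peer
    med: Optional[int]    # MED as received; None if absent


def med_value(med: Optional[int]) -> int:
    # Assumption: a missing MED compares as 0.
    return 0 if med is None else med


def med_for_ibgp(route: Route, drop_med: bool) -> Optional[int]:
    # The MED this router would advertise into IBGP if the route won.
    if route.external and drop_med:
        return None
    return route.med


def select(routes: List[Route], drop_med: bool = True) -> Optional[Route]:
    externals = [r for r in routes if r.external]
    internals = [r for r in routes if not r.external]

    # Step 1: the received MED picks the best directly learned EBGP route.
    best_ext = min(externals, key=lambda r: med_value(r.med), default=None)

    # Step 2: compare that winner against IBGP routes using only the MED
    # that would be advertised into IBGP.  On a tie the external route
    # wins (assumption standing in for the later tie-breakers).
    candidates = internals + ([best_ext] if best_ext is not None else [])
    return min(
        candidates,
        key=lambda r: (med_value(med_for_ibgp(r, drop_med)), not r.external),
        default=None,
    )
```

With drop_med=True the received MED only orders the external routes
against each other and plays no part in the comparison against IBGP
routes, so the border router and the router one hop in compare the
same values.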

I think the rewording of BGP4 is unclear about this.  An
implementation MAY compare MEDs from all EBGP peers even if the MED
will be dropped before passing into IBGP.  Perhaps that should even be
a SHOULD rather than a MAY.  If the EBGP learned MED is not being
passed into IBGP, the EBGP learned MED MUST NOT be used in the
comparison with an IBGP learned route.  To do so may cause a routing
loop.
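
To make the loop concrete, here is a toy example under assumed
conditions: border routers R1 and R2 both peer with the same
neighboring AS, both drop MED before IBGP, and R1 reaches R2 only
through an interior router C.  The MED and IGP cost numbers, the
router names, and the helper functions are all invented for
illustration, and a dropped MED is assumed to compare as 0.

```python
# MED as received from the neighboring AS at each border router.
RECEIVED_MED = {"R1": 100, "R2": 10}
# MED as seen over IBGP: dropped at the border, assumed to compare as 0.
ADVERTISED_MED = {"R1": 0, "R2": 0}
# IGP cost from each deciding router to each exit (R1 -- C -- R2).
IGP_COST = {
    "R1": {"R1": 0, "R2": 3},
    "C": {"R1": 1, "R2": 2},
}


def chosen_exit(router: str, use_received_med_locally: bool) -> str:
    """Pick an exit by lowest MED, then lowest IGP cost."""
    def key(exit_router: str):
        is_own_external = exit_router == router
        if is_own_external and use_received_med_locally:
            med = RECEIVED_MED[exit_router]    # the prohibited behavior
        else:
            med = ADVERTISED_MED[exit_router]  # what IBGP peers see
        return (med, IGP_COST[router][exit_router])
    return min(("R1", "R2"), key=key)


def forwarding_loop(use_received_med_locally: bool) -> bool:
    # C hands traffic to R1 while R1 hands it back toward R2 through C.
    return (chosen_exit("C", use_received_med_locally) == "R1"
            and chosen_exit("R1", use_received_med_locally) == "R2")
```

In this toy topology, forwarding_loop(True) is True and
forwarding_loop(False) is False: the loop appears only when R1 uses
the received MED of 100 against the IBGP route while advertising no
MED to C.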

Curtis


5.1.4   MULTI_EXIT_DISC


   The MULTI_EXIT_DISC attribute may be used on external (inter-AS)
   links to discriminate among multiple exit or entry points to the same
   neighboring AS.  The value of the MULTI_EXIT_DISC attribute is a four
   octet unsigned number which is called a metric.  All other factors
   being equal, the exit or entry point with lower metric should be
   preferred.  If received over external links, the MULTI_EXIT_DISC
   attribute MAY be propagated over internal links to other BGP speakers
   within the same AS.  The MULTI_EXIT_DISC attribute received from a
   neighboring AS MUST NOT be propagated to other neighboring ASs.

   A BGP speaker MUST IMPLEMENT a mechanism based on local configuration
   which allows the MULTI_EXIT_DISC attribute to be removed from a
   route.  This MAY be done either prior to or after determining the
   degree of preference of the route and performing route selection
   (decision process phases 1 and 2).

   An implementation MAY also (based on local configuration) alter the
   value of the MULTI_EXIT_DISC attribute received over an external
   link.  If it does so, it shall do so prior to determining the degree
   of preference of the route and performing route selection (decision
   process phases 1 and 2).


	These last two paragraphs may need to be changed to indicate a case
	where an implementation may become prone to routing loops if
	the wrong behavior is chosen.  In other words, let's constrain
	this according to the conditions above to only allow
	implementations that will remain loop free, but allow EBGP MED
	to be compared to satisfy Vince's requirement (which is what
	gated does and what we've been asking vendors to do anyway).
	Change to:

   A BGP speaker MUST IMPLEMENT a mechanism based on local
   configuration which allows the MULTI_EXIT_DISC attribute to be
   removed from a route.  An implementation MAY also (based on local
   configuration) alter the value of the MULTI_EXIT_DISC attribute
   received over an external link.

   When performing route comparisons, an altered or removed value of
   MULTI_EXIT_DISC, as specified by local configuration, MAY be used
   in external (inter-AS) routes that were learned directly (from
   external BGP peers).  When comparing a directly learned external
   route (external BGP) to a route learned from an internal BGP peer,
   the same MULTI_EXIT_DISC for the external route that will be passed
   to internal BGP peers must be used in the comparison.  All route
   comparisons will be made according to the procedures for
   determining the degree of preference of the route and performing
   route selection (decision process phases 1 and 2) described in
   section 9.1.

	What this does is make MED more like LOCAL-PREF in that the
	MED used in Adj-RIB-In selection need not be as constrained as
	the MED used in phase 2 of route selection.  In "9.1.1 Phase
	1: Calculation of Degree of Preference" we have:

   For each newly received or replacement feasible route, the local BGP
   speaker shall determine a degree of preference.  If the route is
   learned from an internal peer, the value of the LOCAL_PREF attribute
   shall be taken as the degree of preference.  If the route is learned
   from an external peer, then the degree of preference shall be
   computed based on preconfigured policy information and used as the
   LOCAL_PREF value in any IBGP readvertisement.  The exact nature of
   this policy information and the computation involved is a local
   matter.  The local speaker shall then run the internal update process
   of 9.2.1 to select and advertise the most preferable route.

	We could clarify by changing to:

   For each newly received or replacement feasible route, the local
   BGP speaker shall determine a degree of preference.  If the route
   is learned from an internal peer, the value of the LOCAL_PREF
   attribute shall be taken as the degree of preference and the
   MULTI_EXIT_DISC attribute shall be used in phase 2 of the selection
   process.

   If the route is learned from an external peer, then the degree of
   preference shall be computed based on preconfigured policy
   information and used as the LOCAL_PREF value in any IBGP
   readvertisement.  The exact nature of this policy information and
   the computation involved is a local matter.  If the route is
   learned from an external peer, the MULTI_EXIT_DISC may be altered
   or removed.  The MULTI_EXIT_DISC before or after alteration or
   removal may be used in Adj-RIB-In degree of preference comparisons.
   The MULTI_EXIT_DISC value used in phase 2 comparisons MUST be the
   same MULTI_EXIT_DISC value (or lack of) advertised to internal BGP
   neighbors if the route were to be selected.

   The local speaker shall then run the internal update process of
   9.2.1 as described above to select and advertise the most
   preferable route.

   Failure to use the same LOCAL_PREF and MULTI_EXIT_DISC values in
   phase 2 comparisons as are advertised to internal BGP neighbors can
   result in routing loops and is therefore prohibited.
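
	One way an implementation might keep the two MED values apart
	is sketched below.  This is only an illustration of the
	proposed wording: the record layout, the function names, and
	the policy callback are hypothetical and not taken from gated
	or any other implementation.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Tuple

# Local configuration: maps a received MED to the altered MED,
# or to None if the MED is removed.  (Hypothetical interface.)
MedPolicy = Callable[[Optional[int]], Optional[int]]


@dataclass(frozen=True)
class Phase1Result:
    local_pref: int                 # degree of preference
    adj_rib_in_med: Optional[int]   # MED used in Adj-RIB-In comparisons
    phase2_med: Optional[int]       # MED used in phase 2 and sent to IBGP


def phase1_internal(local_pref: int, med: Optional[int]) -> Phase1Result:
    # Internal peer: LOCAL_PREF and MED are taken as received.
    return Phase1Result(local_pref, med, med)


def phase1_external(received_med: Optional[int],
                    policy_pref: int,
                    med_policy: MedPolicy,
                    compare_received: bool = True) -> Phase1Result:
    # External peer: preference comes from local policy; the MED may be
    # altered or removed.  Either the received or the altered MED may be
    # used in Adj-RIB-In comparisons, but phase 2 uses only the altered one.
    altered = med_policy(received_med)
    rib_in = received_med if compare_received else altered
    return Phase1Result(policy_pref, rib_in, altered)


def ibgp_advertisement(result: Phase1Result) -> Tuple[int, Optional[int]]:
    # What internal neighbors see is exactly what phase 2 compared.
    return (result.local_pref, result.phase2_med)
```

	Because ibgp_advertisement() reads the same fields that phase 2
	would compare, the values cannot drift apart by construction.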

	Of course, I have the bad habit of using shorthand terms
	like EBGP peer.  In keeping with ISO practices of making
	sure standards bear as little resemblance as possible to
	terms used in the field, please replace any such oversights
	with the appropriately obscure and wordy BGP4 RFC-speak.  :(
	For example, can I say internal peer?