Re: diffs to draft-ietf-idr-bgp4-03.txt
"John G. Scudder" <jgs@ieng.com> Tue, 17 September 1996 17:32 UTC
Message-Id: <v03007805ae63c2689ddb@[141.211.162.142]>
In-Reply-To: <199609140420.AAA21003@brookfield.ans.net>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Date: Tue, 17 Sep 1996 12:28:18 -0400
To: curtis@ans.net
From: "John G. Scudder" <jgs@ieng.com>
Subject: Re: diffs to draft-ietf-idr-bgp4-03.txt
Cc: bgp@ans.net, curtis@ans.net
Sender: owner-idr@merit.edu
Precedence: bulk
At 12:20 AM -0400 9/14/96, Curtis Villamizar wrote:
>Yakov et al.
>
>Here are some suggested changes to the BGP4 draft. Briefly, there are
>a few separable issues:
>
> 1. Some wording changes (ie: hunk 1, 6, 7, 8, 14)
Comments on these interspersed with your text below. Note that I don't
think that hunk 8 is just a wording change (see comment below).
> 2. Proposing "BGP resynchronization". See below.
I'm inclined to agree with Yakov wrt resynchronization. I'll comment on
this in a later message.
--John
P.S.: I assume that the spelling will be fixed up on any changes which are
incorporated...
>***************
>*** 1208,1213 ****
>--- 1256,1275 ----
> The same attribute cannot appear more than once within the Path
> Attributes field of a particular UPDATE message.
>
>+ The manditory category refers to a field which must be present in
>+ both IBGP and EBGP exchanges. Attributes classified as optional
>+ for the purpose of the protocol extension mechanism may be purely
>+ discresionary, or discresionary, required, or disallowed in certain
>+ contexts.
>+
>+ attribute EBGP IBGP
>+ ORIGIN manditory manditory
>+ AS_PATH manditory manditory
>+ NEXT_HOP discresionary discresionary
No. NEXT_HOP is mandatory. It's too late to change this even if we wanted
to; other things like route reflection rely on NEXT_HOP.
>+ MULTI_EXIT_DISC discresionary discresionary
>+ LOCAL_PREF disallowed required
>+ ATOMIC_AGGREGATE discresionary discresionary
I've said in earlier messages that ATOMIC_AGGREGATE is not discretionary.
I now see that this is ambiguous. 5.1.6 says "shall":
"the local system shall attach the ATOMIC_AGGREGATE attribute to
the route"
but 9.1.4 says "should":
"it should add ATOMIC_AGGREGATE attribute to the route"
Clearly either 5.1.6 or 9.1.4 needs to be fixed.
If we decide on "shall" (my vote is for this) then this table entry is
wrong, since ATOMIC_AGGREGATE is mandatory under certain circumstances,
which don't relate to EBGP/IBGP. Unless you can think of a way to fix the
table, I suggest just explaining in the attribute's 5.1.x section exactly
when it is mandatory, forbidden or discretionary. The table could be left
and ATOMIC_AGGREGATE could be flagged with "see below", or the table could
be removed.
If we decide on "should" then the table is fine.
>+ AGGREGATOR discresionary discresionary
>
>
> 5.1 Path Attribute Usage
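For concreteness, the table in this hunk could be checked mechanically along the following lines, with NEXT_HOP corrected to mandatory as argued above. This is only a sketch: the dictionary layout and the name check_attributes are made up for illustration and appear nowhere in the draft. ATOMIC_AGGREGATE is left out on purpose, since whether it is required does not depend on EBGP vs. IBGP.

```python
# Hypothetical encoding of the proposed attribute table (not from the draft).
# NEXT_HOP is listed as mandatory, per the comment above.
REQUIREMENTS = {
    # attribute:       (EBGP,            IBGP)
    "ORIGIN":          ("mandatory",     "mandatory"),
    "AS_PATH":         ("mandatory",     "mandatory"),
    "NEXT_HOP":        ("mandatory",     "mandatory"),
    "MULTI_EXIT_DISC": ("discretionary", "discretionary"),
    "LOCAL_PREF":      ("disallowed",    "required"),
    "AGGREGATOR":      ("discretionary", "discretionary"),
}


def check_attributes(present, session):
    """Return (missing, forbidden) attribute names for one UPDATE.

    present: set of attribute names carried by the UPDATE.
    session: "EBGP" or "IBGP".
    """
    if session not in ("EBGP", "IBGP"):
        raise ValueError("session must be 'EBGP' or 'IBGP'")
    col = 0 if session == "EBGP" else 1
    missing = sorted(
        name for name, req in REQUIREMENTS.items()
        if req[col] in ("mandatory", "required") and name not in present
    )
    forbidden = sorted(
        name for name, req in REQUIREMENTS.items()
        if req[col] == "disallowed" and name in present
    )
    return missing, forbidden
```

For example, an IBGP UPDATE carrying only ORIGIN, AS_PATH and NEXT_HOP would be reported as missing LOCAL_PREF, and the same UPDATE with LOCAL_PREF added would be flagged if sent over EBGP.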
>***************
>*** 1342,1348 ****
> preferred. If received over external links, the MULTI_EXIT_DISC
> attribute may be propagated over internal links to other BGP speakers
> within the same AS. The MULTI_EXIT_DISC attribute is never
>! propagated to other BGP speakers in neighboring AS's.
>
>
> 5.1.5 LOCAL_PREF
>--- 1404,1410 ----
> preferred. If received over external links, the MULTI_EXIT_DISC
> attribute may be propagated over internal links to other BGP speakers
> within the same AS. The MULTI_EXIT_DISC attribute is never
>! propagated from IBGP to other BGP speakers in neighboring AS's.
Personally, I think this is more confusing, not clearer. I think I finally
figured out what you are trying to do here. If the point is to make it
clear that the MED may be *originated* in EBGP updates (but not propagated
from EBGP to EBGP), the proposed text is also not quite complete since it
doesn't cover the case in which my border router is propagating routes
between two external peers in two different neighbor ASes, without the
routes passing into IBGP and out again. (Strictly speaking, it also
doesn't cover the hypothetical case where we use IGP flooding instead of
IBGP.)
How about this:
The MULTI_EXIT_DISC attribute received from a neighboring AS
is never propagated to BGP speakers in other neighboring ASes.
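As a rough illustration of that wording, the outbound handling might look like the sketch below. The function and parameter names are hypothetical, and the sketch assumes every external peer is treated alike: a received MED is never sent to one, while a locally configured MED may be originated.

```python
def outgoing_med(received_med, peer_is_internal, originated_med=None):
    """Pick the MULTI_EXIT_DISC value (or None) to send to one peer.

    received_med:     MED carried by the route as we learned it, or None.
    peer_is_internal: True if the peer is in our own AS.
    originated_med:   MED our own configuration wants to originate
                      toward an external peer, or None.
    """
    if peer_is_internal:
        # The draft says the MED "may be" propagated over internal links;
        # this sketch chooses to propagate it.
        return received_med
    # External peer: a MED received from a neighboring AS is never passed on.
    # Only a locally originated value, if any, is sent.
    return originated_med
```

This covers the border-router case mentioned above: a route learned from one external peer and sent straight to another goes through the external branch, so the received MED is dropped without the route ever passing through IBGP.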
[...]
>***************
>*** 2199,2213 ****
> operating on all new or unfeasible routes contained within it.
>
> For each newly received or replacement feasible route, the local BGP
>! speaker shall determine a degree of preference. If the route is
>! learned from a BGP speaker in the local autonomous system, either the
> value of the LOCAL_PREF attribute shall be taken as the degree of
>! preference, or the local system shall compute the degree of
>! preference of the route based on preconfigured policy information. If
>! the route is learned from a BGP speaker in a neighboring autonomous
>! system, then the degree of preference shall be computed based on
>! preconfigured policy information. The exact nature of this policy
>! information and the computation involved is a local matter. The
>
>
>
>--- 2313,2327 ----
> operating on all new or unfeasible routes contained within it.
>
> For each newly received or replacement feasible route, the local BGP
>! speaker shall determine a degree of preference. If the route is
>! learned from a BGP speaker in the local autonomous system, the
> value of the LOCAL_PREF attribute shall be taken as the degree of
>! preference. If the route is learned from a BGP speaker in a
>! neighboring autonomous system, then the degree of preference shall
>! be computed based on preconfigured policy information and used as
>! the LOCAL_PREF value in any IBGP readvertisement. The exact nature
>! of this policy information and the computation involved is a local
>! matter. The
I like this. I want to point out though, that this is more than an
editorial change. It's actually more restrictive than the text it replaces
since it makes it illegal to apply policy to IBGP routes. Insofar as it's
probably a bad idea to apply policy to IBGP routes, this is good. But, I
don't know that this will make a difference to anyone's implementation -- I
suspect that this will just be ignored. A compromise would be to allow
local policy to be applied to IBGP routes, but to make respecting
LOCAL_PREF the mandatory default (local policy for IBGP would be optional).
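A minimal sketch of that compromise, with made-up names (degree_of_preference, ebgp_policy, ibgp_policy) and routes represented as plain dictionaries purely for illustration:

```python
def degree_of_preference(route, from_internal_peer, ebgp_policy, ibgp_policy=None):
    """Compute the degree of preference for one route.

    route:              dict with at least a "local_pref" key for IBGP routes.
    from_internal_peer: True if the route was learned via IBGP.
    ebgp_policy:        callable(route) -> int, local policy for EBGP routes.
    ibgp_policy:        optional callable(route) -> int; when None (the
                        default), LOCAL_PREF is taken as-is for IBGP routes.
    """
    if from_internal_peer:
        if ibgp_policy is not None:
            # Optional local override for IBGP-learned routes.
            return ibgp_policy(route)
        # Default behavior: respect the LOCAL_PREF attribute.
        return route["local_pref"]
    # EBGP-learned route: preference comes from local policy; the result
    # would then be sent as LOCAL_PREF in any IBGP readvertisement.
    return ebgp_policy(route)
```

Under the stricter text proposed in the hunk above, the ibgp_policy argument would simply not exist.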
>***************
>*** 2243,2259 ****
> the local BGP speaker doesn't have a route in its Loc-RIB, the BGP
> route SHOULD be excluded from the Phase 2 decision function.
>
>! For each set of destinations for which a feasible route exists in the
>! Adj-RIBs-In, the local BGP speaker shall identify the route that has:
>!
>! a) the highest degree of preference of any route to the same set
>! of destinations, or
>!
>! b) is the only route to that destination, or
>!
>! c) is selected as a result of the Phase 2 tie breaking rules
>! specified in 9.1.2.1.
>!
>
> The local speaker SHALL then install that route in the Loc-RIB,
> replacing any route to the same destination that is currently being
>--- 2357,2416 ----
> the local BGP speaker doesn't have a route in its Loc-RIB, the BGP
> route SHOULD be excluded from the Phase 2 decision function.
>
>! It is critical that routers within an AS do not make conflicting
>! decisions regarding route selection that would cause forwarding
>! loops to occur (routers pointing traffic as each other or in a
>! cycle). BGP speakers will consider the following attributes in
The parenthetical explanation of forwarding loops isn't quite right.
Routers tend to "point traffic at each other" in the course of normal
correct operation. I would suggest removing the parenthetical altogether
-- the concept of a forwarding loop is sufficiently well-known that it
doesn't have to be spelled out.
>! determining preference and will consider no others.
^^^^^^^^^^^^^^^^^^^^^^^^^^^
This is my favorite change. Keep it even if no others are kept!
>!
>! 1. NEXT_HOP. The next hop must be reachable, or if NEXT_HOP is
>! not provided, the advertising router must be reachable)
This should just be "the next hop must be reachable." NEXT_HOP is
mandatory (see earlier comment) and therefore must be provided.
>!
>! 1. LOCAL_PREF.
>!
>! 2. MULTI_EXIT_DISC. This comparison is applicable only when
>! comparing MULTI_EXIT_DISC received from the same
>! neighboring AS, including those received from the same
>! neighboring AS and passed via IBGP.
>!
>! 3. Internal routing protocol cost. Lower costs are preferred.
>! Select the route that has the lowest cost (interior
>! distance) to the entity depicted by the NEXT_HOP attribute
>! of the route.
>!
>! 4. Advertising router BGP Identifier. If at least one of the
>! candidate routes was advertised by the BGP speaker in a
>! neighboring autonomous system, select the route that was
>! advertised by the BGP speaker in a neighboring autonomous
>! system whose BGP Identifier has the lowest value among all
>! other BGP speakers in neighboring autonomous systems.
>! Otherwise, select the route that was advertised by the BGP
>! speaker whose BGP Identifier has the lowest value.
>!
>! The MULTI_EXIT_DISC may be dropped when passing a route via IBGP.
>! To prevent routing loops a route with a MULTI_EXIT_DISC (if the
>! routes and received from the same neighboring AS and therefore the
>! comparison is applicable) is preferred over one without.
>!
>! The outcome of a comparison must be independent of the order in
>! which routes are placed in the Adj-Rib-In and Local-Rib. Due to
>! difference in the backlog of propogated routes among IBGP peers, the
>! same set of routes may arrive in different orders on differnt
>! routers. If routes are compared in arbitrary order, under certain
>! conditions routing loops can occur. This is a consequence of the
>! use of MULTI_EXIT_DISC in comparing routes received from the same
>! neighboring AS but not considering MULTI_EXIT_DISC when comparing
>! routes received from differing neighboring AS.
>!
>! The comparison algorithm must insure a deterministic decision
>! outcome. To accomplish this, routes are compared in a constrained
>! order. First all routes are checked for feasibility, insuring that
>! the NEXT_HOP is reachable. All routes from the same neighboring AS
>! are first compared using the criteria above, LOCAL_PREF,
>! MULTI_EXIT_DISC, internal routing protocol cost, and Advertising
>! router IP address. Then the best route selected from each
>! neighboring AS are compared using the criteria, LOCAL_PREF, internal
>! routing protocol cost, and Advertising router BGP Identifier.
This paragraph is hard to understand. It seems to have been cut-n-pasted a
few times -- at least two things are done "first." Also, it seems to
reiterate the steps just described. Is this necessary? I think it needs
to be rewritten.
As you know, I'm also not fond of specifying the algorithm in this fashion.
Other algorithms can certainly produce the same result. A disclaimer
that any algorithm producing the same results is acceptable is called for.
As you know I submitted some other text that uses some bits of pseudo-code
to explain the same thing. I know you don't like it. Obviously, I do.
Mainly, I prefer to do it that way because I think that the above text kind
of obscures the desired outcome -- you have to add the "must be independent
of order" and "in a constrained order" paragraphs to fix this up. I prefer
to make it explicit in the actual specification of the steps rather than as
a fix-up note.
That said, I have no idea what anyone else likes. I think that your text is
probably also correct (just more confusing, to me), and I'm pretty much worn
down to the point where I don't care to debate this any more. But, I
wanted to point out to the WG that there has already been another
alternative proposed here.
I do like the fact that you eliminated the idea of tie-breaking and just
put it all in 9.1.2.
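To make the "constrained order" concrete, here is one possible reading of the steps quoted above, written out as code. It is a sketch of the quoted text only, not the pseudo-code mentioned earlier, and it rests on some assumptions: the Route record and its field names are invented, a lower MULTI_EXIT_DISC is taken as preferred, and the final tie-break uses the BGP Identifier (the quoted text says "IP address" in one place and "BGP Identifier" in another).

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Route:
    neighbor_as: int            # AS the route was originally received from
    local_pref: int             # higher is preferred
    med: Optional[int]          # lower is preferred; None if absent
    igp_cost: int               # interior distance to NEXT_HOP, lower wins
    bgp_id: int                 # advertising router's BGP Identifier
    external: bool = True       # advertised by a speaker in a neighboring AS
    next_hop_reachable: bool = True


def _med_key(r):
    # A route carrying a MED is preferred over one without.
    return (0, r.med) if r.med is not None else (1, 0)


def _id_key(r):
    # Externally advertised routes win the tie-break, then lowest identifier.
    return (not r.external, r.bgp_id)


def select_best(routes):
    """Return the best route, or None if no route is feasible."""
    # Feasibility check: NEXT_HOP must be reachable.
    feasible = [r for r in routes if r.next_hop_reachable]

    # Stage 1: within each neighboring AS, MED takes part in the comparison.
    by_as = {}
    for r in feasible:
        by_as.setdefault(r.neighbor_as, []).append(r)
    finalists = [
        min(group, key=lambda r: (-r.local_pref, _med_key(r), r.igp_cost, _id_key(r)))
        for group in by_as.values()
    ]
    if not finalists:
        return None

    # Stage 2: across neighboring ASes, MED is not compared.
    return min(finalists, key=lambda r: (-r.local_pref, r.igp_cost, _id_key(r)))
```

Because the routes are grouped before any comparison is made, the result does not depend on the order in which they arrived, which is the property the "must be independent of the order" paragraph is after. Any other algorithm that yields the same choice would do equally well.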
> The local speaker SHALL then install that route in the Loc-RIB,
> replacing any route to the same destination that is currently being
[...]
>***************
>*** 2425,2451 ****
> the less specific route.
>
> If a BGP speaker receives overlapping routes, the Decision Process
>! shall take into account the semantics of the overlapping routes. In
>! particular, if a BGP speaker accepts the less specific route while
>! rejecting the more specific route from the same peer, then the
>! destinations represented by the overlap may not forward along the ASs
>! listed in the AS_PATH attribute of that route. Therefore, a BGP
>! speaker has the following choices:
>!
>! a) Install both the less and the more specific routes
>!
>! b) Install the more specific route only
>!
>! c) Install the non-overlapping part of the less specific
>! route only (that implies de-aggregation)
>!
>! d) Aggregate the two routes and install the aggregated route
>!
>! e) Install the less specific route only
>!
>! f) Install neither route
>
>! If a BGP speaker chooses e), then it should add ATOMIC_AGGREGATE
> attribute to the route. A route that carries ATOMIC_AGGREGATE
> attribute can not be de-aggregated. That is, the NLRI of this route
>
>--- 2543,2554 ----
> the less specific route.
>
> If a BGP speaker receives overlapping routes, the Decision Process
>! shall consider both routes based on configured acceptance policy.
>! If both are accepted the Decision Process will install both the
>! less and the more specific routes or aggregate the two routes and
>! install the aggregated route.
>
>! If a BGP speaker chooses to aggregate, then it should add ATOMIC_AGGREGATE
> attribute to the route. A route that carries ATOMIC_AGGREGATE
> attribute can not be de-aggregated. That is, the NLRI of this route
Note earlier comments regarding inconsistency between "shall" and "should"
for ATOMIC_AGGREGATE. (This is a pre-existing problem with the spec and
not related to Curtis's changes.)
[...]
>***************
>*** 2500,2531 ****
> When a BGP speaker receives a new route from a BGP speaker in a
> neighboring autonomous system, it shall advertise that route to all
> other BGP speakers in its autonomous system by means of an UPDATE
>! message if any of the following conditions occur:
>!
>! 1) the degree of preference assigned to the newly received route
>! by the local BGP speaker is higher than the degree of preference
>! that the local speaker has assigned to other routes that have been
>! received from BGP speakers in neighboring autonomous systems, or
>!
>! 2) there are no other routes that have been received from BGP
>!
>!
>!
>! Expiration Date February 1996 [Page 42]
>!
>!
>!
>!
>!
>! INTERNET DRAFT August 1996
>!
>!
>! speakers in neighboring autonomous systems, or
>!
>! 3) the newly received route is selected as a result of breaking a
>! tie between several routes which have the highest degree of
>! preference, and the same destination (the tie-breaking procedure
>! is specified in 9.2.1.1).
>
> When a BGP speaker receives an UPDATE message with a non-empty
> WITHDRAWN ROUTES field, it shall remove from its Adj-RIB-In all
>--- 2603,2610 ----
> When a BGP speaker receives a new route from a BGP speaker in a
> neighboring autonomous system, it shall advertise that route to all
> other BGP speakers in its autonomous system by means of an UPDATE
>! message if this route has become the best route installed in the
>! Local-Rib according to the route selection rules in 9.1.2.
>
> When a BGP speaker receives an UPDATE message with a non-empty
> WITHDRAWN ROUTES field, it shall remove from its Adj-RIB-In all
I like this change a lot. I think that it could be simplified even a bit
more by removing "the best route", like so: "...if this route has been
installed in the Local-Rib according..." -- after all, it will only be
installed in the Loc-RIB if it's chosen as the best route.
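Put as a condition, the simplified wording amounts to something like the following. The names are hypothetical, and the Loc-RIB is modeled as a plain dictionary keyed by destination purely for illustration.

```python
def should_advertise_internally(route, destination, loc_rib, learned_from_external_peer):
    """Decide whether a newly received route is sent to internal peers.

    loc_rib: dict mapping destination -> the route currently installed.
    """
    # Only routes learned from a neighboring AS are covered by this rule,
    # and only if route selection actually installed them in the Loc-RIB.
    return learned_from_external_peer and loc_rib.get(destination) == route
```

The three-way test in the old text collapses into the single Loc-RIB lookup, since being installed already implies having been selected as best.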
--
John Scudder email: jgs@ieng.com
Internet Engineering Group, LLC phone: (313) 669-8800
122 S. Main, Suite 280 fax: (313) 669-8661
Ann Arbor, MI 41804 www: http://www.ieng.com