RE: [Idr] comments on idr-bgp4-20
Pekka Savola <pekkas@netcore.fi> Thu, 25 September 2003 05:10 UTC
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id BAA24946; Thu, 25 Sep 2003 01:10:35 -0400 (EDT)
Received: from ietf-mx ([132.151.6.1]) by ietf-mx with esmtp (Exim 4.12) id 1A2OOl-0000ME-00; Thu, 25 Sep 2003 01:10:39 -0400
Received: from ietf.org ([132.151.1.19] helo=optimus.ietf.org) by ietf-mx with esmtp (Exim 4.12) id 1A2OOl-0000M7-00; Thu, 25 Sep 2003 01:10:39 -0400
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1A2OO9-000337-Eo; Thu, 25 Sep 2003 01:10:01 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1A2ONc-00032g-BB for idr@optimus.ietf.org; Thu, 25 Sep 2003 01:09:28 -0400
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id BAA24903 for <idr@ietf.org>; Thu, 25 Sep 2003 01:09:21 -0400 (EDT)
Received: from ietf-mx ([132.151.6.1]) by ietf-mx with esmtp (Exim 4.12) id 1A2ONY-0000LD-00 for idr@ietf.org; Thu, 25 Sep 2003 01:09:24 -0400
Received: from netcore.fi ([193.94.160.1]) by ietf-mx with esmtp (Exim 4.12) id 1A2ONX-0000L6-00 for idr@ietf.org; Thu, 25 Sep 2003 01:09:23 -0400
Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h8P58gL30140; Thu, 25 Sep 2003 08:08:43 +0300
From: Pekka Savola <pekkas@netcore.fi>
To: Susan Hares <shares@nexthop.com>
cc: idr@ietf.org
Subject: RE: [Idr] comments on idr-bgp4-20
In-Reply-To: <BE36D687C8CBFA4AB76F5CD3E3C0FB4E010CD149@aa-exchange1.corp.nexthop.com>
Message-ID: <Pine.LNX.4.44.0309250808180.30104-100000@netcore.fi>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset="US-ASCII"
Sender: idr-admin@ietf.org
Errors-To: idr-admin@ietf.org
X-BeenThere: idr@ietf.org
X-Mailman-Version: 2.0.12
Precedence: bulk
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/idr>, <mailto:idr-request@ietf.org?subject=unsubscribe>
List-Id: Inter-Domain Routing <idr.ietf.org>
List-Post: <mailto:idr@ietf.org>
List-Help: <mailto:idr-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/idr>, <mailto:idr-request@ietf.org?subject=subscribe>
List-Archive: <https://www1.ietf.org/mail-archive/working-groups/idr/>
Date: Thu, 25 Sep 2003 08:08:42 +0300
On Wed, 24 Sep 2003, Susan Hares wrote: > For context on the message, could you > tell me are you: > > 2) Operator using BGP Operator. > -----Original Message----- > From: Pekka Savola [mailto:pekkas@netcore.fi] > Sent: Wednesday, September 24, 2003 3:50 AM > To: idr@ietf.org > Subject: [Idr] comments on idr-bgp4-20 > > > Hi, > > [note, I'm not subscribed to the idr list] > > Alex solicited ops-dir for folks to review bgp4-20 spec. After almost > half a year, I finally found the time to look at it. Comments below. I > tried to look at the issues list draft-ietf-idr-bgp-issues-01.txt, but at > least on the quick glance, I didn't find major overlaps. > > Overview: > - the spec is in pretty good shape, but I'm not sure if all the features > listed are really full standard material.. it might be better to separate > some of them (e.g. anything relating to aggregation) to a separate doc (so > that we'd have a much tigher and better real base spec), and recycle it at > DS, but that might be much more effort.. not sure if something like that > would be worth it. > > General thoughts on going to full standard: > - If we want to go for full standard, I have to question the usefulness of > several BGP features: > - AS_SET AS_PATH attribute > - whole route aggregation concept (nowadays, it seems to be causing > more trouble than it's worth -- instead of proper advertisements, just > automatically aggregate instead of fixing the advertisements to begin > with). Maybe pver 10 years ago it may have been relevant if all the edge > BGP routers might not support BGP-4 or CIDR.. > - about N++ different cases of third-party nexthop usage. > > As a general observation from the text: > - I skipped section 8 on FSM, as it was really boring, and I'm not sure > whehter it's of any use keeping it in the base spec. > - I got major headache when trying to read the aggregating rules > especially in section 9.2.2.2 (and that was supposed to be the _simple_ > rules :-(!!!) and ended up skipping them. > > substantial > ----------- > > 1) Section 3 on Summary of the Operations, states: > > In the context of this document we assume that a BGP speaker adver- > tises to its peers only those routes that it itself uses (in this > context a BGP speaker is said to "use" a BGP route if it is the most > preferred BGP route and is used in forwarding). All other cases are > outside the scope of this document. > > this "usability" requirement is in conflict with section 4.3 on UPDATE > message format: > > An UPDATE message is used to advertise feasible routes sharing common > path attributes to a peer, [...] > > ==> in particular note that "feasible" is a superset of "only those routes > the router itself uses" (compare to "those routes the router itself _could_ > use if they were the most preferred ones"), right? > > ==> also note that the former definition is (IMHO) operationally unsound. > It prevents you from advertising P-t-P addresses and loopbacks (which also > reside in IGP) using BGP, because the IGP distance is lower than BGP and > "BGP route is never used in forwarding". (depending on how you interpret > that). IMHO, this really needs to be changed to be OK. > > 2) The OPEN message format is confusing. Why on earth do you have those > empty bits there? Are they null-padding or just a typographical convention > for the clarity of the diagram (the latter)? > > 0 1 2 3 > 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 > +-+-+-+-+-+-+-+-+ > | Version | > +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > | My Autonomous System | > +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > | Hold Time | > +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > | BGP Identifier | > +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > | Opt Parm Len | > +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > | | > | Optional Parameters (variable) | > | | > +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ > > ==> I strongly recommend to either: > 1) go back to standard format, or > 2) rephrase the message format similar with UPDATE messages, i.e., drop all > the pretense about field lengths, bits 0-31, etc. > > 3) there is significant confusion on what "locally originated" means. It is > not clear whether this refers to "local to the BGP speaker (i.e., the same > system)" or "local to the routing domain (i.e., the same AS)". This must be > clarified in the text below, and the text should be reviewed. > > 5.1.3 NEXT_HOP > > The NEXT_HOP is a well-known mandatory attribute that defines the IP > address of the router that SHOULD be used as the next hop to the des- > tinations listed in the UPDATE message. The NEXT_HOP attribute is > calculated as follows. > > 1) When sending a message to an internal peer, if the route is not > locally originated the BGP speaker SHOULD NOT modify the NEXT_HOP > attribute, unless it has been explicitly configured to announce > its own IP address as the NEXT_HOP. When announcing a locally > originated route to an internal peer, the BGP speaker SHOULD use > as the NEXT_HOP the interface address of the router through which > the announced network is reachable for the speaker; if the route > is directly connected to the speaker, or the interface address of > the router through which the announced network is reachable for > the speaker is the internal peer's address, then the BGP speaker > SHOULD use for the NEXT_HOP attribute its own IP address (the > address of the interface that is used to reach the peer). > > 4) there appears to have been added some bogosities about inter-AS traffic > engineering in section 9.3: > > - If the local AS appears in the AS path of the new route being > considered, then that new route can not be viewed as better than > any other route (provided that the speaker is configured to accept > such routes). If such a route were ever used, a routing loop could > result. > > and: > > The ability of a BGP speaker to include more than one instance of > its own AS in the AS_PATH attribute for the purpose of inter-AS > traffic engineering. > > ==> what do these have to do with base spec? I don't think they're > operationally feasible or used. Remove any indication that it's OK to > receive your own AS in the routes you receive!! > > 5) the reasoning for this section is not clear. It seems to be two parts > (the second one starting at "These routes..."). It seems partially, at > least for the second part, a set of operational guidelines.. which probably > don't have anything to do with the BGP spec! > > 9.4 Originating BGP routes > > A BGP speaker may originate BGP routes by injecting routing informa- > tion acquired by some other means (e.g. via an IGP) into BGP. A BGP > speaker that originates BGP routes assigns the degree of preference > to these routes by passing them through the Decision Process (see > Section 9.1). These routes MAY also be distributed to other BGP > speakers within the local AS as part of the update process (see Sec- > tion 9.2). The decision whether to distribute non-BGP acquired routes > within an AS via BGP or not depends on the environment within the AS > (e.g. type of IGP) and SHOULD be controlled via configuration. > > 6) references; note that as the Security Consideration section is a bit > short, the vulnerability analysis should be a *NORMATIVE* reference. > However, one must ask here and with RFC2385 as well whether normative > references are OK from Full Standard to a lower maturity level. Unless > this has already been investigated, it may be worth checking with the ADs > that this is OK. > > BGP vulnerabilities analysis is discussed in [XXX]. > > Normative References > > [RFC2385] Heffernan, A., "Protection of BGP Sessions via the TCP MD5 > Signature Option", RFC2385, August 1998. > > Non-normative References > > [XXX] Murphy, S., "BGP Security Vulnerabilities Analysis", draft- > ietf-idr-bgp-vuln-00.txt, work in progress > > > semi-substantial > ---------------- > > ==> I'd add "Decision Process", "Routing Table", and "BGP route" in the > terminology. (Especially the last one is a bit shady, see more comments > later.) > > Adj-RIB-In > Adj-RIBs-In > Adj-RIB-Out > Adj-RIBs-Out > > ==> in some places, one seems to use "Adj-RIBs-{In,Out}", in some others, > "Adj-RIB-{In,Out}". The plural/singular form should be consistent. > > BGP speaker > A router that implements BGP. > > ==> uhh, no. Throughout the text "BGP speaker" is used to refer to > BGP-*enabled* routers. Implementation is a required but not sufficient > condition to that. Consider rewording to something like: > > BGP speaker > A router that implements and has been configured to use BGP. > > .... > > BGP Identifier: > > This 4-octet unsigned integer indicates the BGP Identifier of > the sender. A given BGP speaker sets the value of its BGP Iden- > tifier to an IP address assigned to that BGP speaker. The > value of the BGP Identifier is determined on startup and is the > same for every local interface and every BGP peer. > > ==> the last sentence is confusing "same for every local interface".. yeah, > right, but interfaces are irrelevant here. Also, this ignores the fact > that it's perfectly legal to change BGP identifiers after startup but prior > to establishing a BGP session. Consider rewording to something > like: > > This 4-octet unsigned integer indicates the BGP Identifier of > the sender. A given BGP speaker sets the value of its BGP Iden- > tifier to an IP address assigned to that BGP speaker. The > value of the BGP Identifier is determined prior to establishing > BGP sessions and is the same for every session the BGP speaker. > > ... > > Path Attributes: > > A variable length sequence of path attributes is present in > every UPDATE message, except for an UPDATE message that carries > only the withdrawn routes. Each path attribute is a triple > <attribute type, attribute length, attribute value> of variable > length. > > ==> should a simple diagram be added here, just like in section 4.2? > > The high-order bit (bit 0) of the Attribute Flags octet is the > Optional bit. It defines whether the attribute is optional (if > set to 1) or well-known (if set to 0). > > ==> this and the following descriptions on attributes are out of place if > you have not described the remaining categories (because they're used when > describing the attributes in the next page), that is, discretionary and > mandatory. Consider rewording the above to something like: > > The high-order bit (bit 0) of the Attribute Flags octet is the > Optional bit. It defines whether the attribute is optional (if > set to 1) or well-known (if set to 0). Well-known attributes can > be either mandatory (MUST be included in every UPDATE message that > contains NLRI), or discretionary (inclusion not required). > > ... > > All well-known attributes MUST be passed along (after proper updat- > ing, if necessary) to other BGP peers. > > ==> LOCAL-PREF is an exception here. I also note that the BGP's definition > of "transitive" is different ("unrecognized optional attributes are only > passed onward when the transitive bit is set, but recognized optional > attributes are always passed onward") from the apparent language ("always passed > along"), and might use a bit of clarification. > > ... > > b) When a given BGP speaker advertises the route to an external > peer, then the advertising speaker updates the AS_PATH attribute > as follows: > > 1) if the first path segment of the AS_PATH is of type > AS_SEQUENCE, the local system prepends its own AS number as the > last element of the sequence (put it in the leftmost position). > If the act of prepending will cause an overflow in the AS_PATH > segment, i.e. more than 255 ASs, it is legal to prepend a new > segment of type AS_SEQUENCE and prepend its own AS number to > this new segment. > > 2) if the first path segment of the AS_PATH is of type AS_SET, > the local system prepends a new path segment of type > AS_SEQUENCE to the AS_PATH, including its own AS number in that > segment. > > ==> the text here seems confusing because it is not clear whether things are > added at the beginning or at the end of the segments, for example: > > "the _last_ element sequence (put it in the _leftmost_ position)" > > .. I think this requires more precise language on which end of the segment > these values are really inserted. I.e., is it "physically" (in memory) > left-to-right (from more recent to more distant), or just interpreted that > way? This has significant impact on the last sentence of section 1) in case > of overflows and the like. > > Whenever the modification of the AS_PATH attribute calls for includ- > ing or prepending the AS number of the local system, the local system > MAY include/prepend more than one instance of its own AS number in > the AS_PATH attribute. This is controlled via local configuration. > > ==> what about prepending/adding someone else's AS? Is this relevant enough > to be considered? > > A BGP speaker that receives a route with the ATOMIC_AGGREGATE > attribute MUST NOT make any NLRI of that route more specific (as > defined in 9.1.4) when advertising this route to other BGP speakers. > > ==> the second line seems out-of-place here. I assume it means that the > router must not perform automatic de-aggregation, not that the router must > not have more specific routes to advertise. In any case, a rewording might > be in order, e.g. like: > > A BGP speaker that receives a route with the ATOMIC_AGGREGATE > attribute MUST NOT de-aggregate routes (as defined in 9.1.4); that is, > generating more specifics by de-aggregating a route is forbidden. > > (note that this actually makes the statement stronger, and more in line of > 9.1.4, that you should not do it even locally.) > > If the Length field of the message header is less than 19 or greater > than 4096, or if the Length field of an OPEN message is less than the > minimum length of the OPEN message, or if the Length field of an > UPDATE message is less than the minimum length of the UPDATE message, > or if the Length field of a KEEPALIVE message is not equal to 19, or > if the Length field of a NOTIFICATION message is less than the mini- > mum length of the NOTIFICATION message, then the Error Subcode is set > to Bad Message Length. The Data field contains the erroneous Length > field. > > ==> convert this to a bulleted list or something, so it's easier to read, > please! :-) > > The IP address in the NEXT_HOP MUST meet the following criteria to be > considered semantically correct: > > a) It MUST NOT be the IP address of the receiving speaker > > b) In the case of an EBGP where the sender and receiver are one IP > hop away from each other, either the IP address in the NEXT_HOP > MUST be the sender's IP address (that is used to establish the BGP > connection), or the interface associated with the NEXT_HOP IP > address MUST share a common subnet with the receiving BGP speaker. > > > ==> what's exactly the case for the first case of b)? Couldn't thid be > cleaned a bit (it seems to me that the second case is a superset of the > first)? In addition, I'd recommend rewording "one IP hop away" -- > that begs the question whether you mean "direct neighbors" or "just one > transit router". When thinking of it more, the former seems apparent, but a > different wording might flesh this out better. > > If a BGP speaker chooses to aggregate, then it SHOULD either include > all AS used to form the aggreagate in an AS_SET or add the > ATOMIC_AGGREGATE attribute to the route. This attribute is now pri- > marily informational. With the elimination of IP routing protocols > that do not support classless routing and the elimination of router > and host implementations that do not support classless routing, there > is no longer a need to deaggregate. Routes SHOULD NOT be de-aggre- > gated. A route that carries ATOMIC_AGGREGATE attribute in particular > MUST NOT be de-aggregated. That is, the NLRI of this route can not be > made more specific. Forwarding along such a route does not guarantee > that IP packets will actually traverse only ASs listed in the AS_PATH > attribute of the route. > > ==> this is confusing, especially the end, beginning from "That is, ..." and > should be reworded a bit (does it cover only deaggregation or more specifics > in general?), like: > > If a BGP speaker chooses to aggregate, then it SHOULD either include > all AS used to form the aggreagate in an AS_SET or add the > ATOMIC_AGGREGATE attribute to the route. This attribute is now pri- > marily informational. With the elimination of IP routing protocols > that do not support classless routing and the elimination of router > and host implementations that do not support classless routing, there > is no longer a need to deaggregate. > > Routes SHOULD NOT be de-aggregated. A route that carries ATOMIC_AGGREGATE > attribute in particular MUST NOT be de-aggregated. That is, one must not > generate more specifics of the NLRI of this route. Forwarding along > such a route does not guarantee > that IP packets will actually traverse only ASs listed in the AS_PATH > attribute of the route. However, accepting more specific advertisments > which an aggregate route covers is acceptable. > > ... > > A BGP speaker SHOULD NOT advertise a given feasible BGP route from > its Adj-RIB-Out if it would produce an UPDATE message containing the > same BGP route as was previously advertised. > > ==> and now we get back to the definition of "same BGP route"? Identical prefix > carried over BGP? Identical prefix carried over BGP with identical attributes?" > I'm assuming the former, otherwise (at least!) the above should be > clarified. > > Aggregation reduces the amount of information that a BGP speaker must > store and exchange with other BGP speakers. Routes can be aggregated > by applying the following procedure separately to path attributes of > like type and to the Network Layer Reachability Information. > > ==> the last line sounds really weird; consider: > - s/like/the same/ ? > - add more text after the NLRI, like "NLRI field" ? > > > > typos or other editorials > ------------------------- > > ==> there should not be hyphenation at the linebreaks. > > This document specifies the base behavior of the BGP protocol. This > behavior can and is modified by extention specifications. When the > protocol is extended the new behavior is fully documented in the > extention specifications. > > ==> s/extention/extension/ (two places) > > 3.2 Routing Information Bases > > The Routing Information Base (RIB) within a BGP speaker consists of > three distinct parts: > > ==> the subject is in plural form ("RIBs"), but the text begins as singular > ("The RIB"). Consider which one to use? > > a) ORIGIN (Type Code 1): > > b) AS_PATH (Type Code 2): > > ==> kill the extra spaces before ORIGIN and NEXT_HOP. > > 1) if the first path segment of the AS_PATH is of type > AS_SEQUENCE, the local system prepends its own AS number as the > last element of the sequence (put it in the leftmost position). > > ==> rewrite the last one as: > last element of the sequence, putting it in the leftmost position. > > A BGP speaker MUST IMPLEMENT a mechanism based on local configuration > > ==> s/IMPLEMENT/implement/ > > used by a BGP speaker's decision process to discriminate > > ==> s/decision process/Decision Process/ (similar in a few other places?) > > attribute needs to be cognizant of the fact that the actual path to > > ==> s/cognizant/aware/ > > 6. BGP Error Handling. > > ==> remove the periods from at the end of section titles, esp. in subctions > of section 6. > > issue a open collision dump [Event 23]. When the local > system receives a open collision dump event [Event 23], the > > ==> s/a open/an open/ (two times) > > installed in LocRib and will be excluded from the next phase of route > > ==> s/LocRib/Loc-RIB/ > > all AS used to form the aggreagate in an AS_SET or add the > > ==> s/aggreagate/aggregate/ > > is no longer a need to deaggregate. Routes SHOULD NOT be de-aggre- > gated. A route that carries ATOMIC_AGGREGATE attribute in particular > MUST NOT be de-aggregated. That is, the NLRI of this route can not be > > ==> pick either "deaggregate" or "de-aggregate" and stick to that. > > have the origin attribute with the value EGP. In all other case > > ==> s/case/cases/ > > > -- Pekka Savola "You each name yourselves king, yet the Netcore Oy kingdom bleeds." Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings _______________________________________________ Idr mailing list Idr@ietf.org https://www1.ietf.org/mailman/listinfo/idr
- [Idr] comments on idr-bgp4-20 Pekka Savola
- RE: [Idr] comments on idr-bgp4-20 Susan Hares
- Re: [Idr] comments on idr-bgp4-20 John G. Scudder
- RE: [Idr] comments on idr-bgp4-20 Pekka Savola
- Re: [Idr] comments on idr-bgp4-20 Pekka Savola
- Re: [Idr] comments on idr-bgp4-20 Jeffrey Haas
- Re: [Idr] comments on idr-bgp4-20 Pekka Savola
- Re: [Idr] comments on idr-bgp4-20 John G. Scudder
- Re: [Idr] comments on idr-bgp4-20 Pekka Savola
- Re: [Idr] comments on idr-bgp4-20 Dennis Ferguson
- Re: [Idr] comments on idr-bgp4-20 Yakov Rekhter
- Re: [Idr] comments on idr-bgp4-20 Pekka Savola
- Re: [Idr] comments on idr-bgp4-20 Pekka Savola
- Re: [Idr] comments on idr-bgp4-20 Yakov Rekhter
- Re: [Idr] comments on idr-bgp4-20 Pekka Savola
- Re: [Idr] comments on idr-bgp4-20 Yakov Rekhter
- Re: [Idr] comments on idr-bgp4-20 Dennis Ferguson
- Re: [Idr] comments on idr-bgp4-20 Pekka Savola
- Re: [Idr] comments on idr-bgp4-20 Pekka Savola
- Re: [Idr] comments on idr-bgp4-20 John G. Scudder
- RE: [Idr] comments on idr-bgp4-20 Susan Hares
- [Idr] issue 11.2: active route, again [Re: commen… Pekka Savola
- Re: [Idr] comments on idr-bgp4-20 Yakov Rekhter
- [Idr] issue 11.2: active route, again [Re: commen… Pedro Roque Marques
- Re: [Idr] issue 11.2: active route, again [Re: co… Pekka Savola
- Re: [Idr] comments on idr-bgp4-20 Pekka Savola
- Re: [Idr] issue 11.2: active route, again [Re: co… Pedro Roque Marques
- Re: [Idr] issue 11.2: active route, again [Re: co… Pekka Savola