RE: [Idr] comments on idr-bgp4-20

Pekka Savola <pekkas@netcore.fi> Thu, 25 September 2003 05:10 UTC

Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id BAA24946; Thu, 25 Sep 2003 01:10:35 -0400 (EDT)
Received: from ietf-mx ([132.151.6.1]) by ietf-mx with esmtp (Exim 4.12) id 1A2OOl-0000ME-00; Thu, 25 Sep 2003 01:10:39 -0400
Received: from ietf.org ([132.151.1.19] helo=optimus.ietf.org) by ietf-mx with esmtp (Exim 4.12) id 1A2OOl-0000M7-00; Thu, 25 Sep 2003 01:10:39 -0400
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1A2OO9-000337-Eo; Thu, 25 Sep 2003 01:10:01 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1A2ONc-00032g-BB for idr@optimus.ietf.org; Thu, 25 Sep 2003 01:09:28 -0400
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id BAA24903 for <idr@ietf.org>; Thu, 25 Sep 2003 01:09:21 -0400 (EDT)
Received: from ietf-mx ([132.151.6.1]) by ietf-mx with esmtp (Exim 4.12) id 1A2ONY-0000LD-00 for idr@ietf.org; Thu, 25 Sep 2003 01:09:24 -0400
Received: from netcore.fi ([193.94.160.1]) by ietf-mx with esmtp (Exim 4.12) id 1A2ONX-0000L6-00 for idr@ietf.org; Thu, 25 Sep 2003 01:09:23 -0400
Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h8P58gL30140; Thu, 25 Sep 2003 08:08:43 +0300
From: Pekka Savola <pekkas@netcore.fi>
To: Susan Hares <shares@nexthop.com>
cc: idr@ietf.org
Subject: RE: [Idr] comments on idr-bgp4-20
In-Reply-To: <BE36D687C8CBFA4AB76F5CD3E3C0FB4E010CD149@aa-exchange1.corp.nexthop.com>
Message-ID: <Pine.LNX.4.44.0309250808180.30104-100000@netcore.fi>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset="US-ASCII"
Sender: idr-admin@ietf.org
Errors-To: idr-admin@ietf.org
X-BeenThere: idr@ietf.org
X-Mailman-Version: 2.0.12
Precedence: bulk
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/idr>, <mailto:idr-request@ietf.org?subject=unsubscribe>
List-Id: Inter-Domain Routing <idr.ietf.org>
List-Post: <mailto:idr@ietf.org>
List-Help: <mailto:idr-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/idr>, <mailto:idr-request@ietf.org?subject=subscribe>
List-Archive: <https://www1.ietf.org/mail-archive/working-groups/idr/>
Date: Thu, 25 Sep 2003 08:08:42 +0300

On Wed, 24 Sep 2003, Susan Hares wrote:
> For context on the message, could you
> tell me are you:
> 
> 	2) Operator using BGP

Operator.

> -----Original Message-----
> From: Pekka Savola [mailto:pekkas@netcore.fi]
> Sent: Wednesday, September 24, 2003 3:50 AM
> To: idr@ietf.org
> Subject: [Idr] comments on idr-bgp4-20
> 
> 
> Hi,
> 
> [note, I'm not subscribed to the idr list]
> 
> Alex solicited ops-dir for folks to review bgp4-20 spec.  After almost
> half a year, I finally found the time to look at it.  Comments below.  I
> tried to look at the issues list draft-ietf-idr-bgp-issues-01.txt, but at
> least on the quick glance, I didn't find major overlaps.
> 
> Overview:
>  - the spec is in pretty good shape, but I'm not sure if all the features
> listed are really full standard material.. it might be better to separate
> some of them (e.g. anything relating to aggregation) to a separate doc (so
> that we'd have a much tigher and better real base spec), and recycle it at
> DS, but that might be much more effort.. not sure if something like that
> would be worth it.
> 
> General thoughts on going to full standard:
>  - If we want to go for full standard, I have to question the usefulness of
> several BGP features:
>   - AS_SET AS_PATH attribute
>   - whole route aggregation concept (nowadays, it seems to be causing 
> more trouble than it's worth -- instead of proper advertisements, just 
> automatically aggregate instead of fixing the advertisements to begin 
> with).  Maybe pver 10 years ago it may have been relevant if all the edge 
> BGP routers might not support BGP-4 or CIDR..
>   - about N++ different cases of third-party nexthop usage.
> 
> As a general observation from the text:
>  - I skipped section 8 on FSM, as it was really boring, and I'm not sure
> whehter it's of any use keeping it in the base spec.
>  - I got major headache when trying to read the aggregating rules
> especially in section 9.2.2.2 (and that was supposed to be the _simple_
> rules :-(!!!) and ended up skipping them.
> 
> substantial
> -----------
> 
> 1) Section 3 on Summary of the Operations, states:
> 
>    In the context of this document we assume that a BGP speaker adver-
>    tises to its peers only those routes that it itself uses (in this
>    context a BGP speaker is said to "use" a BGP route if it is the most
>    preferred BGP route and is used in forwarding). All other cases are
>    outside the scope of this document.
> 
> this "usability" requirement is in conflict with section 4.3 on UPDATE
> message format:
> 
>    An UPDATE message is used to advertise feasible routes sharing common
>    path attributes to a peer, [...]
> 
> ==> in particular note that "feasible" is a superset of "only those routes
> the router itself uses" (compare to "those routes the router itself _could_
> use if they were the most preferred ones"), right?
> 
> ==> also note that the former definition is (IMHO) operationally unsound.  
> It prevents you from advertising P-t-P addresses and loopbacks (which also 
> reside in IGP) using BGP, because the IGP distance is lower than BGP and 
> "BGP route is never used in forwarding". (depending on how you interpret 
> that).  IMHO, this really needs to be changed to be OK.
> 
> 2) The OPEN message format is confusing.  Why on earth do you have those
> empty bits there?  Are they null-padding or just a typographical convention
> for the clarity of the diagram (the latter)?
>                                                                                                      
>        0                   1                   2                   3
>        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
>        +-+-+-+-+-+-+-+-+
>        |    Version    |
>        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>        |     My Autonomous System      |
>        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>        |           Hold Time           |
>        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>        |                         BGP Identifier                        |
>        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>        | Opt Parm Len  |
>        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>        |                                                               |
>        |             Optional Parameters (variable)                    |
>        |                                                               |
>        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> 
> ==> I strongly recommend to either:
>  1) go back to standard format, or
>  2) rephrase the message format similar with UPDATE messages, i.e., drop all
> the pretense about field lengths, bits 0-31, etc.
> 
> 3) there is significant confusion on what "locally originated" means.  It is
> not clear whether this refers to "local to the BGP speaker (i.e., the same
> system)" or "local to the routing domain (i.e., the same AS)".  This must be
> clarified in the text below, and the text should be reviewed.
> 
> 5.1.3 NEXT_HOP
>                                                                                                      
>    The NEXT_HOP is a well-known mandatory attribute that defines the IP
>    address of the router that SHOULD be used as the next hop to the des-
>    tinations listed in the UPDATE message. The NEXT_HOP attribute is
>    calculated as follows.
>                                                                                                      
>       1) When sending a message to an internal peer, if the route is not
>       locally originated the BGP speaker SHOULD NOT modify the NEXT_HOP
>       attribute, unless it has been explicitly configured to announce
>       its own IP address as the NEXT_HOP. When announcing a locally
>       originated route to an internal peer, the BGP speaker SHOULD use
>       as the NEXT_HOP the interface address of the router through which
>       the announced network is reachable for the speaker; if the route
>       is directly connected to the speaker, or the interface address of
>       the router through which the announced network is reachable for
>       the speaker is the internal peer's address, then the BGP speaker
>       SHOULD use for the NEXT_HOP attribute its own IP address (the
>       address of the interface that is used to reach the peer).
> 
> 4) there appears to have been added some bogosities about inter-AS traffic
> engineering in section 9.3:
> 
>       - If the local AS appears in the AS path of the new route being
>       considered, then that new route can not be viewed as better than
>       any other route (provided that the speaker is configured to accept
>       such routes). If such a route were ever used, a routing loop could
>       result.
> 
> and:
> 
>      The ability of a BGP speaker to include more than one instance of
>       its own AS in the AS_PATH attribute for the purpose of inter-AS
>       traffic engineering.
> 
> ==> what do these have to do with base spec?  I don't think they're
> operationally feasible or used.  Remove any indication that it's OK to
> receive your own AS in the routes you receive!!
> 
> 5) the reasoning for this section is not clear.  It seems to be two parts
> (the second one starting at "These routes...").  It seems partially, at
> least for the second part, a set of operational guidelines.. which probably
> don't have anything to do with the BGP spec!
> 
> 9.4 Originating BGP routes
>                                                                                                   
>    A BGP speaker may originate BGP routes by injecting routing informa-
>    tion acquired by some other means (e.g. via an IGP) into BGP. A BGP
>    speaker that originates BGP routes assigns the degree of preference
>    to these routes by passing them through the Decision Process (see
>    Section 9.1). These routes MAY also be distributed to other BGP
>    speakers within the local AS as part of the update process (see Sec-
>    tion 9.2). The decision whether to distribute non-BGP acquired routes
>    within an AS via BGP or not depends on the environment within the AS
>    (e.g. type of IGP) and SHOULD be controlled via configuration.
> 
> 6) references; note that as the Security Consideration section is a bit
> short, the vulnerability analysis should be a *NORMATIVE* reference. 
> However, one must ask here and with RFC2385 as well whether normative
> references are OK from Full Standard to a lower maturity level.  Unless
> this has already been investigated, it may be worth checking with the ADs
> that this is OK.
> 
>    BGP vulnerabilities analysis is discussed in [XXX].
>                                                                                                 
> Normative References
>                                                                                                 
>    [RFC2385] Heffernan, A., "Protection of BGP Sessions via the TCP MD5
>    Signature Option", RFC2385, August 1998.
>                                                                                                 
> Non-normative References
>                                                                                                 
>    [XXX] Murphy, S., "BGP Security Vulnerabilities Analysis", draft-
>    ietf-idr-bgp-vuln-00.txt, work in progress
> 
> 
> semi-substantial
> ----------------
> 
> ==> I'd add "Decision Process", "Routing Table", and "BGP route" in the
> terminology. (Especially the last one is a bit shady, see more comments
> later.)
> 
>    Adj-RIB-In
>    Adj-RIBs-In
>    Adj-RIB-Out
>    Adj-RIBs-Out
> 
>  ==> in some places, one seems to use "Adj-RIBs-{In,Out}", in some others,
> "Adj-RIB-{In,Out}".  The plural/singular form should be consistent.
> 
>    BGP speaker
>       A router that implements BGP.
> 
> ==> uhh, no.  Throughout the text "BGP speaker" is used to refer to
> BGP-*enabled* routers.  Implementation is a required but not sufficient
> condition to that.  Consider rewording to something like:
> 
>    BGP speaker
>       A router that implements and has been configured to use BGP.
> 
> ....
> 
>       BGP Identifier:
>                                                                                                      
>          This 4-octet unsigned integer indicates the BGP Identifier of
>          the sender. A given BGP speaker sets the value of its BGP Iden-
>          tifier to an IP address assigned to that BGP speaker.  The
>          value of the BGP Identifier is determined on startup and is the
>          same for every local interface and every BGP peer.
> 
> ==> the last sentence is confusing "same for every local interface".. yeah,
> right, but interfaces are irrelevant here.  Also, this ignores the fact
> that it's perfectly legal to change BGP identifiers after startup but prior
> to establishing a BGP session.  Consider rewording to something
> like:
> 
>          This 4-octet unsigned integer indicates the BGP Identifier of
>          the sender. A given BGP speaker sets the value of its BGP Iden-
>          tifier to an IP address assigned to that BGP speaker.  The
>          value of the BGP Identifier is determined prior to establishing
>          BGP sessions and is the same for every session the BGP speaker.
> 
> ...
> 
>       Path Attributes:
>                                                                                                      
>          A variable length sequence of path attributes is present in
>          every UPDATE message, except for an UPDATE message that carries
>          only the withdrawn routes. Each path attribute is a triple
>          <attribute type, attribute length, attribute value> of variable
>          length.
> 
> ==> should a simple diagram be added here, just like in section 4.2?
> 
>          The high-order bit (bit 0) of the Attribute Flags octet is the
>          Optional bit. It defines whether the attribute is optional (if
>          set to 1) or well-known (if set to 0).
> 
> ==> this and the following descriptions on attributes are out of place if
> you have not described the remaining categories (because they're used when
> describing the attributes in the next page), that is, discretionary and
> mandatory.  Consider rewording the above to something like:
> 	
>          The high-order bit (bit 0) of the Attribute Flags octet is the
>          Optional bit. It defines whether the attribute is optional (if
>          set to 1) or well-known (if set to 0).  Well-known attributes can
>          be either mandatory (MUST be included in every UPDATE message that
>          contains NLRI), or discretionary (inclusion not required).
> 
> ...
> 
>    All well-known attributes MUST be passed along (after proper updat-
>    ing, if necessary) to other BGP peers.
> 
> ==> LOCAL-PREF is an exception here.  I also note that the BGP's definition
> of "transitive" is different ("unrecognized optional attributes are only
> passed onward when the transitive bit is set, but recognized optional
> attributes are always passed onward") from the apparent language ("always passed
> along"), and might use a bit of clarification.
> 
> ...
> 
>       b) When a given BGP speaker advertises the route to an external
>       peer, then the advertising speaker updates the AS_PATH attribute
>       as follows:
>                                                                                                      
>          1) if the first path segment of the AS_PATH is of type
>          AS_SEQUENCE, the local system prepends its own AS number as the
>          last element of the sequence (put it in the leftmost position).
>          If the act of prepending will cause an overflow in the AS_PATH
>          segment, i.e. more than 255 ASs, it is legal to prepend a new
>          segment of type AS_SEQUENCE and prepend its own AS number to
>          this new segment.
>                                                                                                      
>          2) if the first path segment of the AS_PATH is of type AS_SET,
>          the local system prepends a new path segment of type
>          AS_SEQUENCE to the AS_PATH, including its own AS number in that
>          segment.
>                                                                                                      
> ==> the text here seems confusing because it is not clear whether things are
> added at the beginning or at the end of the segments, for example:
> 
> "the _last_ element sequence (put it in the _leftmost_ position)"
> 
> .. I think this requires more precise language on which end of the segment
> these values are really inserted.  I.e., is it "physically" (in memory)
> left-to-right (from more recent to more distant), or just interpreted that
> way?  This has significant impact on the last sentence of section 1) in case
> of overflows and the like.
>  
>    Whenever the modification of the AS_PATH attribute calls for includ-
>    ing or prepending the AS number of the local system, the local system
>    MAY include/prepend more than one instance of its own AS number in
>    the AS_PATH attribute. This is controlled via local configuration.
> 
> ==> what about prepending/adding someone else's AS?  Is this relevant enough
> to be considered?
> 
>    A BGP speaker that receives a route with the ATOMIC_AGGREGATE
>    attribute MUST NOT make any NLRI of that route more specific (as
>    defined in 9.1.4) when advertising this route to other BGP speakers.
> 
> ==> the second line seems out-of-place here.  I assume it means that the
> router must not perform automatic de-aggregation, not that the router must
> not have more specific routes to advertise.  In any case, a rewording might
> be in order, e.g. like:
> 
>    A BGP speaker that receives a route with the ATOMIC_AGGREGATE
>    attribute MUST NOT de-aggregate routes (as defined in 9.1.4); that is, 
>    generating more specifics by de-aggregating a route is forbidden.
> 
> (note that this actually makes the statement stronger, and more in line of
> 9.1.4, that you should not do it even locally.)
> 
>    If the Length field of the message header is less than 19 or greater
>    than 4096, or if the Length field of an OPEN message is less than the
>    minimum length of the OPEN message, or if the Length field of an
>    UPDATE message is less than the minimum length of the UPDATE message,
>    or if the Length field of a KEEPALIVE message is not equal to 19, or
>    if the Length field of a NOTIFICATION message is less than the mini-
>    mum length of the NOTIFICATION message, then the Error Subcode is set
>    to Bad Message Length. The Data field contains the erroneous Length
>    field.
> 
> ==> convert this to a bulleted list or something, so it's easier to read,
> please! :-)
> 
>    The IP address in the NEXT_HOP MUST meet the following criteria to be
>    considered semantically correct:
>                                                                                                   
>       a) It MUST NOT be the IP address of the receiving speaker
>                                                                                                   
>       b) In the case of an EBGP where the sender and receiver are one IP
>       hop away from each other, either the IP address in the NEXT_HOP
>       MUST be the sender's IP address (that is used to establish the BGP
>       connection), or the interface associated with the NEXT_HOP IP
>       address MUST share a common subnet with the receiving BGP speaker.
> 
> 
> ==> what's exactly the case for the first case of b)?  Couldn't thid be
> cleaned a bit (it seems to me that the second case is a superset of the
> first)?  In addition, I'd recommend rewording "one IP hop away" --
> that begs the question whether you mean "direct neighbors" or "just one
> transit router".  When thinking of it more, the former seems apparent, but a
> different wording might flesh this out better.
> 
>    If a BGP speaker chooses to aggregate, then it SHOULD either include
>    all AS used to form the aggreagate in an AS_SET or add the
>    ATOMIC_AGGREGATE attribute to the route.  This attribute is now pri-
>    marily informational.  With the elimination of IP routing protocols
>    that do not support classless routing and the elimination of router
>    and host implementations that do not support classless routing, there
>    is no longer a need to deaggregate.  Routes SHOULD NOT be de-aggre-
>    gated.  A route that carries ATOMIC_AGGREGATE attribute in particular
>    MUST NOT be de-aggregated. That is, the NLRI of this route can not be
>    made more specific. Forwarding along such a route does not guarantee
>    that IP packets will actually traverse only ASs listed in the AS_PATH
>    attribute of the route.
> 
> ==> this is confusing, especially the end, beginning from "That is, ..." and
> should be reworded a bit (does it cover only deaggregation or more specifics
> in general?), like:
> 
>    If a BGP speaker chooses to aggregate, then it SHOULD either include
>    all AS used to form the aggreagate in an AS_SET or add the
>    ATOMIC_AGGREGATE attribute to the route.  This attribute is now pri-
>    marily informational.  With the elimination of IP routing protocols
>    that do not support classless routing and the elimination of router
>    and host implementations that do not support classless routing, there
>    is no longer a need to deaggregate.  
> 
>    Routes SHOULD NOT be de-aggregated.  A route that carries ATOMIC_AGGREGATE 
>    attribute in particular MUST NOT be de-aggregated. That is, one must not
>    generate more specifics of the NLRI of this route. Forwarding along 
>    such a route does not guarantee
>    that IP packets will actually traverse only ASs listed in the AS_PATH
>    attribute of the route.  However, accepting more specific advertisments
>    which an aggregate route covers is acceptable.
> 
> ...
> 
>    A BGP speaker SHOULD NOT advertise a given feasible BGP route from
>    its Adj-RIB-Out if it would produce an UPDATE message containing the
>    same BGP route as was previously advertised.
> 
> ==> and now we get back to the definition of "same BGP route"?  Identical prefix
> carried over BGP?  Identical prefix carried over BGP with identical attributes?"
> I'm assuming the former, otherwise (at least!) the above should be
> clarified.
> 
>    Aggregation reduces the amount of information that a BGP speaker must
>    store and exchange with other BGP speakers. Routes can be aggregated
>    by applying the following procedure separately to path attributes of
>    like type and to the Network Layer Reachability Information.
> 
> ==> the last line sounds really weird; consider:
>  - s/like/the same/ ?
>  - add more text after the NLRI, like "NLRI field" ?
> 
> 
> 
> typos or other editorials
> -------------------------
> 
> ==> there should not be hyphenation at the linebreaks.
> 
>    This document specifies the base behavior of the BGP protocol. This
>    behavior can and is modified by extention specifications.  When the
>    protocol is extended the new behavior is fully documented in the
>    extention specifications.
> 
> ==> s/extention/extension/ (two places)
> 
> 3.2 Routing Information Bases
>                                                                                                      
>   The Routing Information Base (RIB) within a BGP speaker consists of
>    three distinct parts:
> 
> ==> the subject is in plural form ("RIBs"), but the text begins as singular
> ("The RIB").  Consider which one to use?
> 
>          a)   ORIGIN (Type Code 1):
>                                                                                                      
>          b) AS_PATH (Type Code 2):
>                                                                                                      
> ==> kill the extra spaces before ORIGIN and NEXT_HOP.
> 
>          1) if the first path segment of the AS_PATH is of type
>          AS_SEQUENCE, the local system prepends its own AS number as the
>          last element of the sequence (put it in the leftmost position).
>  
> ==> rewrite the last one as:
>          last element of the sequence, putting it in the leftmost position.
> 
>    A BGP speaker MUST IMPLEMENT a mechanism based on local configuration
>  
> ==> s/IMPLEMENT/implement/
> 
>             used by a BGP speaker's decision process to discriminate
>  
> ==> s/decision process/Decision Process/ (similar in a few other places?)
> 
>    attribute needs to be cognizant of the fact that the actual path to
>  
> ==> s/cognizant/aware/
> 
> 6. BGP Error Handling.
>                                                                                                   
> ==> remove the periods from at the end of section titles, esp. in subctions
> of section 6.
> 
>        issue a open collision dump [Event 23].  When the local
>        system receives a open collision dump event [Event 23], the
>  
> ==> s/a open/an open/ (two times)
> 
>    installed in LocRib and will be excluded from the next phase of route
>   
> ==> s/LocRib/Loc-RIB/
> 
>    all AS used to form the aggreagate in an AS_SET or add the
>  
> ==> s/aggreagate/aggregate/
> 
>    is no longer a need to deaggregate.  Routes SHOULD NOT be de-aggre-
>    gated.  A route that carries ATOMIC_AGGREGATE attribute in particular
>    MUST NOT be de-aggregated. That is, the NLRI of this route can not be
>  
> ==> pick either "deaggregate" or "de-aggregate" and stick to that.
> 
>          have the origin attribute with the value EGP. In all other case
>  
> ==> s/case/cases/
> 
> 
> 

-- 
Pekka Savola                 "You each name yourselves king, yet the
Netcore Oy                    kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings


_______________________________________________
Idr mailing list
Idr@ietf.org
https://www1.ietf.org/mailman/listinfo/idr