[Idr] comments on idr-bgp4-20

Pekka Savola <pekkas@netcore.fi> Wed, 24 September 2003 07:52 UTC

Received: from optimus.ietf.org (ietf.org [132.151.1.19] (may be forged)) by ietf.org (8.9.1a/8.9.1a) with ESMTP id DAA00129; Wed, 24 Sep 2003 03:52:44 -0400 (EDT)
Received: from localhost.localdomain ([127.0.0.1] helo=www1.ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1A24RN-0003Hb-OC; Wed, 24 Sep 2003 03:52:01 -0400
Received: from odin.ietf.org ([132.151.1.176] helo=ietf.org) by optimus.ietf.org with esmtp (Exim 4.20) id 1A24QQ-0003G6-Nb for idr@optimus.ietf.org; Wed, 24 Sep 2003 03:51:02 -0400
Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id DAA29717 for <idr@ietf.org>; Wed, 24 Sep 2003 03:50:55 -0400 (EDT)
Received: from ietf-mx ([132.151.6.1]) by ietf-mx with esmtp (Exim 4.12) id 1A24QO-0007I7-00 for idr@ietf.org; Wed, 24 Sep 2003 03:51:00 -0400
Received: from netcore.fi ([193.94.160.1]) by ietf-mx with esmtp (Exim 4.12) id 1A24QM-0007GO-00 for idr@ietf.org; Wed, 24 Sep 2003 03:50:58 -0400
Received: from localhost (pekkas@localhost) by netcore.fi (8.11.6/8.11.6) with ESMTP id h8O7oRn08766; Wed, 24 Sep 2003 10:50:28 +0300
From: Pekka Savola <pekkas@netcore.fi>
To: idr@ietf.org
Message-ID: <Pine.LNX.4.44.0309241035300.8509-100000@netcore.fi>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset="US-ASCII"
Subject: [Idr] comments on idr-bgp4-20
Sender: idr-admin@ietf.org
Errors-To: idr-admin@ietf.org
X-BeenThere: idr@ietf.org
X-Mailman-Version: 2.0.12
Precedence: bulk
List-Unsubscribe: <https://www1.ietf.org/mailman/listinfo/idr>, <mailto:idr-request@ietf.org?subject=unsubscribe>
List-Id: Inter-Domain Routing <idr.ietf.org>
List-Post: <mailto:idr@ietf.org>
List-Help: <mailto:idr-request@ietf.org?subject=help>
List-Subscribe: <https://www1.ietf.org/mailman/listinfo/idr>, <mailto:idr-request@ietf.org?subject=subscribe>
List-Archive: <https://www1.ietf.org/mail-archive/working-groups/idr/>
Date: Wed, 24 Sep 2003 10:50:26 +0300

Hi,

[note, I'm not subscribed to the idr list]

Alex solicited ops-dir for folks to review bgp4-20 spec.  After almost
half a year, I finally found the time to look at it.  Comments below.  I
tried to look at the issues list draft-ietf-idr-bgp-issues-01.txt, but at
least on the quick glance, I didn't find major overlaps.

Overview:
 - the spec is in pretty good shape, but I'm not sure if all the features
listed are really full standard material.. it might be better to separate
some of them (e.g. anything relating to aggregation) to a separate doc (so
that we'd have a much tigher and better real base spec), and recycle it at
DS, but that might be much more effort.. not sure if something like that
would be worth it.

General thoughts on going to full standard:
 - If we want to go for full standard, I have to question the usefulness of
several BGP features:
  - AS_SET AS_PATH attribute
  - whole route aggregation concept (nowadays, it seems to be causing 
more trouble than it's worth -- instead of proper advertisements, just 
automatically aggregate instead of fixing the advertisements to begin 
with).  Maybe pver 10 years ago it may have been relevant if all the edge 
BGP routers might not support BGP-4 or CIDR..
  - about N++ different cases of third-party nexthop usage.

As a general observation from the text:
 - I skipped section 8 on FSM, as it was really boring, and I'm not sure
whehter it's of any use keeping it in the base spec.
 - I got major headache when trying to read the aggregating rules
especially in section 9.2.2.2 (and that was supposed to be the _simple_
rules :-(!!!) and ended up skipping them.

substantial
-----------

1) Section 3 on Summary of the Operations, states:

   In the context of this document we assume that a BGP speaker adver-
   tises to its peers only those routes that it itself uses (in this
   context a BGP speaker is said to "use" a BGP route if it is the most
   preferred BGP route and is used in forwarding). All other cases are
   outside the scope of this document.

this "usability" requirement is in conflict with section 4.3 on UPDATE
message format:

   An UPDATE message is used to advertise feasible routes sharing common
   path attributes to a peer, [...]

==> in particular note that "feasible" is a superset of "only those routes
the router itself uses" (compare to "those routes the router itself _could_
use if they were the most preferred ones"), right?

==> also note that the former definition is (IMHO) operationally unsound.  
It prevents you from advertising P-t-P addresses and loopbacks (which also 
reside in IGP) using BGP, because the IGP distance is lower than BGP and 
"BGP route is never used in forwarding". (depending on how you interpret 
that).  IMHO, this really needs to be changed to be OK.

2) The OPEN message format is confusing.  Why on earth do you have those
empty bits there?  Are they null-padding or just a typographical convention
for the clarity of the diagram (the latter)?
                                                                                                     
       0                   1                   2                   3
       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+
       |    Version    |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |     My Autonomous System      |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |           Hold Time           |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                         BGP Identifier                        |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       | Opt Parm Len  |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |                                                               |
       |             Optional Parameters (variable)                    |
       |                                                               |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

==> I strongly recommend to either:
 1) go back to standard format, or
 2) rephrase the message format similar with UPDATE messages, i.e., drop all
the pretense about field lengths, bits 0-31, etc.

3) there is significant confusion on what "locally originated" means.  It is
not clear whether this refers to "local to the BGP speaker (i.e., the same
system)" or "local to the routing domain (i.e., the same AS)".  This must be
clarified in the text below, and the text should be reviewed.

5.1.3 NEXT_HOP
                                                                                                     
   The NEXT_HOP is a well-known mandatory attribute that defines the IP
   address of the router that SHOULD be used as the next hop to the des-
   tinations listed in the UPDATE message. The NEXT_HOP attribute is
   calculated as follows.
                                                                                                     
      1) When sending a message to an internal peer, if the route is not
      locally originated the BGP speaker SHOULD NOT modify the NEXT_HOP
      attribute, unless it has been explicitly configured to announce
      its own IP address as the NEXT_HOP. When announcing a locally
      originated route to an internal peer, the BGP speaker SHOULD use
      as the NEXT_HOP the interface address of the router through which
      the announced network is reachable for the speaker; if the route
      is directly connected to the speaker, or the interface address of
      the router through which the announced network is reachable for
      the speaker is the internal peer's address, then the BGP speaker
      SHOULD use for the NEXT_HOP attribute its own IP address (the
      address of the interface that is used to reach the peer).

4) there appears to have been added some bogosities about inter-AS traffic
engineering in section 9.3:

      - If the local AS appears in the AS path of the new route being
      considered, then that new route can not be viewed as better than
      any other route (provided that the speaker is configured to accept
      such routes). If such a route were ever used, a routing loop could
      result.

and:

     The ability of a BGP speaker to include more than one instance of
      its own AS in the AS_PATH attribute for the purpose of inter-AS
      traffic engineering.

==> what do these have to do with base spec?  I don't think they're
operationally feasible or used.  Remove any indication that it's OK to
receive your own AS in the routes you receive!!

5) the reasoning for this section is not clear.  It seems to be two parts
(the second one starting at "These routes...").  It seems partially, at
least for the second part, a set of operational guidelines.. which probably
don't have anything to do with the BGP spec!

9.4 Originating BGP routes
                                                                                                  
   A BGP speaker may originate BGP routes by injecting routing informa-
   tion acquired by some other means (e.g. via an IGP) into BGP. A BGP
   speaker that originates BGP routes assigns the degree of preference
   to these routes by passing them through the Decision Process (see
   Section 9.1). These routes MAY also be distributed to other BGP
   speakers within the local AS as part of the update process (see Sec-
   tion 9.2). The decision whether to distribute non-BGP acquired routes
   within an AS via BGP or not depends on the environment within the AS
   (e.g. type of IGP) and SHOULD be controlled via configuration.

6) references; note that as the Security Consideration section is a bit
short, the vulnerability analysis should be a *NORMATIVE* reference. 
However, one must ask here and with RFC2385 as well whether normative
references are OK from Full Standard to a lower maturity level.  Unless
this has already been investigated, it may be worth checking with the ADs
that this is OK.

   BGP vulnerabilities analysis is discussed in [XXX].
                                                                                                
Normative References
                                                                                                
   [RFC2385] Heffernan, A., "Protection of BGP Sessions via the TCP MD5
   Signature Option", RFC2385, August 1998.
                                                                                                
Non-normative References
                                                                                                
   [XXX] Murphy, S., "BGP Security Vulnerabilities Analysis", draft-
   ietf-idr-bgp-vuln-00.txt, work in progress


semi-substantial
----------------

==> I'd add "Decision Process", "Routing Table", and "BGP route" in the
terminology. (Especially the last one is a bit shady, see more comments
later.)

   Adj-RIB-In
   Adj-RIBs-In
   Adj-RIB-Out
   Adj-RIBs-Out

 ==> in some places, one seems to use "Adj-RIBs-{In,Out}", in some others,
"Adj-RIB-{In,Out}".  The plural/singular form should be consistent.

   BGP speaker
      A router that implements BGP.

==> uhh, no.  Throughout the text "BGP speaker" is used to refer to
BGP-*enabled* routers.  Implementation is a required but not sufficient
condition to that.  Consider rewording to something like:

   BGP speaker
      A router that implements and has been configured to use BGP.

....

      BGP Identifier:
                                                                                                     
         This 4-octet unsigned integer indicates the BGP Identifier of
         the sender. A given BGP speaker sets the value of its BGP Iden-
         tifier to an IP address assigned to that BGP speaker.  The
         value of the BGP Identifier is determined on startup and is the
         same for every local interface and every BGP peer.

==> the last sentence is confusing "same for every local interface".. yeah,
right, but interfaces are irrelevant here.  Also, this ignores the fact
that it's perfectly legal to change BGP identifiers after startup but prior
to establishing a BGP session.  Consider rewording to something
like:

         This 4-octet unsigned integer indicates the BGP Identifier of
         the sender. A given BGP speaker sets the value of its BGP Iden-
         tifier to an IP address assigned to that BGP speaker.  The
         value of the BGP Identifier is determined prior to establishing
         BGP sessions and is the same for every session the BGP speaker.

...

      Path Attributes:
                                                                                                     
         A variable length sequence of path attributes is present in
         every UPDATE message, except for an UPDATE message that carries
         only the withdrawn routes. Each path attribute is a triple
         <attribute type, attribute length, attribute value> of variable
         length.

==> should a simple diagram be added here, just like in section 4.2?

         The high-order bit (bit 0) of the Attribute Flags octet is the
         Optional bit. It defines whether the attribute is optional (if
         set to 1) or well-known (if set to 0).

==> this and the following descriptions on attributes are out of place if
you have not described the remaining categories (because they're used when
describing the attributes in the next page), that is, discretionary and
mandatory.  Consider rewording the above to something like:
	
         The high-order bit (bit 0) of the Attribute Flags octet is the
         Optional bit. It defines whether the attribute is optional (if
         set to 1) or well-known (if set to 0).  Well-known attributes can
         be either mandatory (MUST be included in every UPDATE message that
         contains NLRI), or discretionary (inclusion not required).

...

   All well-known attributes MUST be passed along (after proper updat-
   ing, if necessary) to other BGP peers.

==> LOCAL-PREF is an exception here.  I also note that the BGP's definition
of "transitive" is different ("unrecognized optional attributes are only
passed onward when the transitive bit is set, but recognized optional
attributes are always passed onward") from the apparent language ("always passed
along"), and might use a bit of clarification.

...

      b) When a given BGP speaker advertises the route to an external
      peer, then the advertising speaker updates the AS_PATH attribute
      as follows:
                                                                                                     
         1) if the first path segment of the AS_PATH is of type
         AS_SEQUENCE, the local system prepends its own AS number as the
         last element of the sequence (put it in the leftmost position).
         If the act of prepending will cause an overflow in the AS_PATH
         segment, i.e. more than 255 ASs, it is legal to prepend a new
         segment of type AS_SEQUENCE and prepend its own AS number to
         this new segment.
                                                                                                     
         2) if the first path segment of the AS_PATH is of type AS_SET,
         the local system prepends a new path segment of type
         AS_SEQUENCE to the AS_PATH, including its own AS number in that
         segment.
                                                                                                     
==> the text here seems confusing because it is not clear whether things are
added at the beginning or at the end of the segments, for example:

"the _last_ element sequence (put it in the _leftmost_ position)"

.. I think this requires more precise language on which end of the segment
these values are really inserted.  I.e., is it "physically" (in memory)
left-to-right (from more recent to more distant), or just interpreted that
way?  This has significant impact on the last sentence of section 1) in case
of overflows and the like.
 
   Whenever the modification of the AS_PATH attribute calls for includ-
   ing or prepending the AS number of the local system, the local system
   MAY include/prepend more than one instance of its own AS number in
   the AS_PATH attribute. This is controlled via local configuration.

==> what about prepending/adding someone else's AS?  Is this relevant enough
to be considered?

   A BGP speaker that receives a route with the ATOMIC_AGGREGATE
   attribute MUST NOT make any NLRI of that route more specific (as
   defined in 9.1.4) when advertising this route to other BGP speakers.

==> the second line seems out-of-place here.  I assume it means that the
router must not perform automatic de-aggregation, not that the router must
not have more specific routes to advertise.  In any case, a rewording might
be in order, e.g. like:

   A BGP speaker that receives a route with the ATOMIC_AGGREGATE
   attribute MUST NOT de-aggregate routes (as defined in 9.1.4); that is, 
   generating more specifics by de-aggregating a route is forbidden.

(note that this actually makes the statement stronger, and more in line of
9.1.4, that you should not do it even locally.)

   If the Length field of the message header is less than 19 or greater
   than 4096, or if the Length field of an OPEN message is less than the
   minimum length of the OPEN message, or if the Length field of an
   UPDATE message is less than the minimum length of the UPDATE message,
   or if the Length field of a KEEPALIVE message is not equal to 19, or
   if the Length field of a NOTIFICATION message is less than the mini-
   mum length of the NOTIFICATION message, then the Error Subcode is set
   to Bad Message Length. The Data field contains the erroneous Length
   field.

==> convert this to a bulleted list or something, so it's easier to read,
please! :-)

   The IP address in the NEXT_HOP MUST meet the following criteria to be
   considered semantically correct:
                                                                                                  
      a) It MUST NOT be the IP address of the receiving speaker
                                                                                                  
      b) In the case of an EBGP where the sender and receiver are one IP
      hop away from each other, either the IP address in the NEXT_HOP
      MUST be the sender's IP address (that is used to establish the BGP
      connection), or the interface associated with the NEXT_HOP IP
      address MUST share a common subnet with the receiving BGP speaker.


==> what's exactly the case for the first case of b)?  Couldn't thid be
cleaned a bit (it seems to me that the second case is a superset of the
first)?  In addition, I'd recommend rewording "one IP hop away" --
that begs the question whether you mean "direct neighbors" or "just one
transit router".  When thinking of it more, the former seems apparent, but a
different wording might flesh this out better.

   If a BGP speaker chooses to aggregate, then it SHOULD either include
   all AS used to form the aggreagate in an AS_SET or add the
   ATOMIC_AGGREGATE attribute to the route.  This attribute is now pri-
   marily informational.  With the elimination of IP routing protocols
   that do not support classless routing and the elimination of router
   and host implementations that do not support classless routing, there
   is no longer a need to deaggregate.  Routes SHOULD NOT be de-aggre-
   gated.  A route that carries ATOMIC_AGGREGATE attribute in particular
   MUST NOT be de-aggregated. That is, the NLRI of this route can not be
   made more specific. Forwarding along such a route does not guarantee
   that IP packets will actually traverse only ASs listed in the AS_PATH
   attribute of the route.

==> this is confusing, especially the end, beginning from "That is, ..." and
should be reworded a bit (does it cover only deaggregation or more specifics
in general?), like:

   If a BGP speaker chooses to aggregate, then it SHOULD either include
   all AS used to form the aggreagate in an AS_SET or add the
   ATOMIC_AGGREGATE attribute to the route.  This attribute is now pri-
   marily informational.  With the elimination of IP routing protocols
   that do not support classless routing and the elimination of router
   and host implementations that do not support classless routing, there
   is no longer a need to deaggregate.  

   Routes SHOULD NOT be de-aggregated.  A route that carries ATOMIC_AGGREGATE 
   attribute in particular MUST NOT be de-aggregated. That is, one must not
   generate more specifics of the NLRI of this route. Forwarding along 
   such a route does not guarantee
   that IP packets will actually traverse only ASs listed in the AS_PATH
   attribute of the route.  However, accepting more specific advertisments
   which an aggregate route covers is acceptable.

...

   A BGP speaker SHOULD NOT advertise a given feasible BGP route from
   its Adj-RIB-Out if it would produce an UPDATE message containing the
   same BGP route as was previously advertised.

==> and now we get back to the definition of "same BGP route"?  Identical prefix
carried over BGP?  Identical prefix carried over BGP with identical attributes?"
I'm assuming the former, otherwise (at least!) the above should be
clarified.

   Aggregation reduces the amount of information that a BGP speaker must
   store and exchange with other BGP speakers. Routes can be aggregated
   by applying the following procedure separately to path attributes of
   like type and to the Network Layer Reachability Information.

==> the last line sounds really weird; consider:
 - s/like/the same/ ?
 - add more text after the NLRI, like "NLRI field" ?



typos or other editorials
-------------------------

==> there should not be hyphenation at the linebreaks.

   This document specifies the base behavior of the BGP protocol. This
   behavior can and is modified by extention specifications.  When the
   protocol is extended the new behavior is fully documented in the
   extention specifications.

==> s/extention/extension/ (two places)

3.2 Routing Information Bases
                                                                                                     
  The Routing Information Base (RIB) within a BGP speaker consists of
   three distinct parts:

==> the subject is in plural form ("RIBs"), but the text begins as singular
("The RIB").  Consider which one to use?

         a)   ORIGIN (Type Code 1):
                                                                                                     
         b) AS_PATH (Type Code 2):
                                                                                                     
==> kill the extra spaces before ORIGIN and NEXT_HOP.

         1) if the first path segment of the AS_PATH is of type
         AS_SEQUENCE, the local system prepends its own AS number as the
         last element of the sequence (put it in the leftmost position).
 
==> rewrite the last one as:
         last element of the sequence, putting it in the leftmost position.

   A BGP speaker MUST IMPLEMENT a mechanism based on local configuration
 
==> s/IMPLEMENT/implement/

            used by a BGP speaker's decision process to discriminate
 
==> s/decision process/Decision Process/ (similar in a few other places?)

   attribute needs to be cognizant of the fact that the actual path to
 
==> s/cognizant/aware/

6. BGP Error Handling.
                                                                                                  
==> remove the periods from at the end of section titles, esp. in subctions
of section 6.

       issue a open collision dump [Event 23].  When the local
       system receives a open collision dump event [Event 23], the
 
==> s/a open/an open/ (two times)

   installed in LocRib and will be excluded from the next phase of route
  
==> s/LocRib/Loc-RIB/

   all AS used to form the aggreagate in an AS_SET or add the
 
==> s/aggreagate/aggregate/

   is no longer a need to deaggregate.  Routes SHOULD NOT be de-aggre-
   gated.  A route that carries ATOMIC_AGGREGATE attribute in particular
   MUST NOT be de-aggregated. That is, the NLRI of this route can not be
 
==> pick either "deaggregate" or "de-aggregate" and stick to that.

         have the origin attribute with the value EGP. In all other case
 
==> s/case/cases/


-- 
Pekka Savola                 "You each name yourselves king, yet the
Netcore Oy                    kingdom bleeds."
Systems. Networks. Security. -- George R.R. Martin: A Clash of Kings




_______________________________________________
Idr mailing list
Idr@ietf.org
https://www1.ietf.org/mailman/listinfo/idr