IDR Agenda Items for Vienna

Yakov Rekhter <yakov@juniper.net> Tue, 27 May 2003 16:11 UTC

Received: from ietf-mx (ietf-mx.ietf.org [132.151.6.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id MAA18390 for <idr-archive@ietf.org>; Tue, 27 May 2003 12:11:25 -0400 (EDT)
Received: from ietf-mx ([132.151.6.1]) by ietf-mx with esmtp (Exim 4.12) id 19Kh1N-0005on-00 for idr-archive@ietf.org; Tue, 27 May 2003 12:09:53 -0400
Received: from trapdoor.merit.edu ([198.108.1.26] ident=postfix) by ietf-mx with esmtp (Exim 4.12) id 19Kh1L-0005oD-00 for idr-archive@ietf.org; Tue, 27 May 2003 12:09:52 -0400
Received: by trapdoor.merit.edu (Postfix) id 546EA91217; Tue, 27 May 2003 12:10:46 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id 2C2C691218; Tue, 27 May 2003 12:10:46 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id DF7E891217 for <idr@trapdoor.merit.edu>; Tue, 27 May 2003 12:10:44 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id C4B655DEB0; Tue, 27 May 2003 12:10:44 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from merlot.juniper.net (natint.juniper.net [207.17.136.129]) by segue.merit.edu (Postfix) with ESMTP id 47B675DE2C for <idr@merit.edu>; Tue, 27 May 2003 12:10:44 -0400 (EDT)
Received: from juniper.net (garnet.juniper.net [172.17.28.17]) by merlot.juniper.net (8.11.3/8.11.3) with ESMTP id h4RGAhu21177; Tue, 27 May 2003 09:10:43 -0700 (PDT) (envelope-from yakov@juniper.net)
Message-Id: <200305271610.h4RGAhu21177@merlot.juniper.net>
To: idr@merit.edu
Cc: skh@nexthop.com
Subject: IDR Agenda Items for Vienna
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-ID: <96682.1054051843.1@juniper.net>
Date: Tue, 27 May 2003 09:10:43 -0700
From: Yakov Rekhter <yakov@juniper.net>
Sender: owner-idr@merit.edu
Precedence: bulk

Folks,
 
It's about time to start thinking about agenda items for the Vienna
IETF. Please forward any IDR agenda items you might have to me and Sue.

Sue & Yakov.



Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id LAA25304 for <idr-archive@nic.merit.edu>; Tue, 27 May 2003 11:16:51 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id 35AA391207; Tue, 27 May 2003 11:16:34 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id EB56191213; Tue, 27 May 2003 11:16:33 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id 61EC891207 for <idr@trapdoor.merit.edu>; Tue, 27 May 2003 11:16:31 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id 43A7C5DE16; Tue, 27 May 2003 11:16:31 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from merlot.juniper.net (natint.juniper.net [207.17.136.129]) by segue.merit.edu (Postfix) with ESMTP id 7A0E65DDF9 for <idr@merit.edu>; Tue, 27 May 2003 11:16:30 -0400 (EDT)
Received: from juniper.net (garnet.juniper.net [172.17.28.17]) by merlot.juniper.net (8.11.3/8.11.3) with ESMTP id h4RFGTu17871; Tue, 27 May 2003 08:16:29 -0700 (PDT) (envelope-from yakov@juniper.net)
Message-Id: <200305271516.h4RFGTu17871@merlot.juniper.net>
To: Alex Zinin <zinin@psg.com>
Cc: idr@merit.edu, rtg-dir@ietf.org
Subject: Re: AD-review comments on draft-ietf-idr-bgp4-20 
In-Reply-To: Your message of "Mon, 05 May 2003 16:38:15 PDT." <177177649135.20030505163815@psg.com> 
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-ID: <77283.1054048589.1@juniper.net>
Date: Tue, 27 May 2003 08:16:29 -0700
From: Yakov Rekhter <yakov@juniper.net>
Sender: owner-idr@merit.edu
Precedence: bulk

Alex,

> Folks,
> 
>  Please find below my AD-review comments. Hopefully they will help
>  improve the document. I tried to consult Andrew's list as much as
>  possible, but do feel free to point out if something has already been
>  discussed and agreed upon.
>  
>  Thanks go to Yakov for kicking me often enough ;)
> --
> Alex Zinin
> 
> Some nits:
> - run it by a spelling checker, please
> - disable hyphenation if possible
> - include boilerplates for IPR notice, Copyright notice

Sure.

> 
> General comment:
> 
>   in some places I highlighted the fact that required behavior is not
>   described using the 2119 language, so it is not clear if a MUST or
>   SHOULD or MAY is applicable. I am sure I've missed some more places
>   like this. I'd like to ask the editors to go through the doc and
>   check this.

Sure.

> > Status of this Memo
> > 
> > 
> ...
> >    The list of Internet-Draft Shadow Directories can be accessed at
> >    http://www.ietf.org/shadow.html.
> >
> > Specification of Requirements
> 
> Nit: move Abstract here. Move requirements after the Acks.

Ok.

> > Abstract
> 
> Should the Abstract say that this spec covers IPv4 only?

Sure.

> > 3. Summary of Operation
> ...
> >    This document uses the term `Autonomous System' (AS) throughout.  The
> >    classic definition of an Autonomous System is a set of routers under
> >    a single technical administration, using an interior gateway protocol
> >    (IGP) and common metrics to determine how to route packets within the
> >    AS, and using an inter-AS routing protocol to determine how to route
> >    packets to other ASs. Since this classic definition was developed, it
> >    has become common for a single AS to use several IGPs and sometimes
> >    several sets of metrics within an AS. The use of the term Autonomous
> >    System here stresses the fact that, even when multiple IGPs and met-
> >    rics are used, the administration of an AS appears to other ASs to
> >    have a single coherent interior routing plan and presents a consis-
> >    tent picture of what destinations are reachable through it.
> 
> Ed: Since 'AS' has been defined before, do we need to repeat the
> definition here?

The definition section before presents a *summary* of the definitions
used in the document. I think that the text reads fine as is, so
I would prefer not to change it.

> ...
> >    peer in the same AS is referred to as an internal peer. Internal BGP
> >    and external BGP are commonly abbreviated IBGP and EBGP.
> 
> Ed: These two have been defined before too

See my previous comment.

> ...
> > Care must be taken to
> >    ensure that the interior routers have all been updated with transit
> >    information before the BGP speakers announce to other ASs that tran-
> >    sit service is being provided.
> 
> What does the last sentence really mean from the implementation
> perspective? It used to mean the BGP/IGP synchronization check. Now
> that iBGP everywhere is assumed, how do we check this condition?

In the absence of any objections by June 10, I suggest taking this
sentence out.

> >    This document specifies the base behavior of the BGP protocol. This
> >    behavior can and is modified by extention specifications.  When the
> Ed: "extension"

Sure.

> >    protocol is extended the new behavior is fully documented in the
> >    extention specifications.
> Ed: "extension"

Sure.

> 
> > 3.1 Routes: Advertisement and Storage
> > 
> >    For the purpose of this protocol, a route is defined as a unit of
> >    information that pairs a set of destinations with the attributes of a
> >    path to those destinations. The set of destinations are systems whose
> >    IP addresses are contained in one IP address prefix carried in the
> >    Network Layer Reachability Information (NLRI) field of an UPDATE mes-
> >    sage, and the path is the information reported in the path attributes
> >    field of the same UPDATE message.
> Ed: Repeated definition again

See above.

> ...
> >    If a BGP speaker chooses to advertise the route, it MAY add to or
> >    modify the path attributes of the route before advertising it to a
> >    peer.
> 
> The intent here is to say that it's ok to modify the attribute set of
> a previously received route when it's announced further. The way it
> reads though is that self-originated routes are also within the
> context and MAY sounds like you don't have to add attributes when
> announcing those.

I will replace "If a BGP speaker chooses to advertise the route" with
"If a BGP speaker chooses to advertise a previously received route".

> 
> ...
> 
> >    Changing attribute of a route is accomplished by advertising a
> >    replacement route. The replacement route carries new (changed)
> >    attributes and has the same NLRI as the original route.
> 
> "same NLRI" implies the same prefix, but not the NLRI field, which can
> be different (containing other routes), should the use of this term be
> normalized throughout the document?

I will replace "the same NLRI" with "the same address prefix".

> 
> > 4.2 OPEN Message Format
> > 
> >    After a TCP is established, the first message sent by each side is an
> 
> "TCP connection"

ok.

> > 5. Path Attributes
> ...
> >    If a path with recognized transitive optional attribute is accepted
> >    and passed along to other BGP peers and the Partial bit in the
> >    Attribute Flags octet is set to 1 by some previous AS, it is not 
> 
> 'MUST NOT' here?

Sure.

> > set
> >    back to 0 by the current AS. Unrecognized non-transitive optional
> >    attributes MUST be quietly ignored and not passed along to other BGP
> >    peers.
> ...
> >    The same attribute (attribute with the same type) can not appear more
> >    than once within the Path Attributes field of a particular UPDATE
> >    message.
> 
> What should an implementation do if this happens?

See section 6.3:

  If any attribute appears more than once in the UPDATE message, then
  the Error Subcode is set to Malformed Attribute List.
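
A minimal sketch of the duplicate check described in section 6.3, for
illustration only. The function name and the representation of the
attributes as (type code, value) pairs are assumptions; the numeric
codes are the ones the BGP-4 spec assigns to UPDATE Message Error and
Malformed Attribute List.

```python
from collections import Counter

UPDATE_MESSAGE_ERROR = 3        # NOTIFICATION Error Code
MALFORMED_ATTRIBUTE_LIST = 1    # Error Subcode


def check_duplicate_attributes(attributes):
    """Return (error_code, subcode) if any attribute type repeats, else None.

    `attributes` is a hypothetical list of (type_code, value) pairs
    already parsed from the Path Attributes field of one UPDATE message.
    """
    counts = Counter(type_code for type_code, _ in attributes)
    if any(n > 1 for n in counts.values()):
        return (UPDATE_MESSAGE_ERROR, MALFORMED_ATTRIBUTE_LIST)
    return None
```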

  
> >    The mandatory category refers to an attribute which MUST be present
> >    in both IBGP and EBGP exchanges if NLRI are contained in the UPDATE
> 
> Ed: "if the NLRI field is contained" instead?

No, as the NLRI field is always present in the UPDATE message
(although if NLRI is not present, then the NLRI field is empty).

> 
> > 5.1.2 AS_PATH
> ...
> >       b) When a given BGP speaker advertises the route to an external
> >       peer, then the advertising speaker updates the AS_PATH attribute
> >       as follows:
> > 
> >          1) if the first path segment of the AS_PATH is of type
> >          AS_SEQUENCE, the local system prepends its own AS number as the
> >          last element of the sequence (put it in the leftmost position).
> 
> 'Leftmost position'... isn't this still open for interpretation? How
> about wording this relative to the position of the octets in the
> protocol message?

I'll replace "the leftmost position" with "the leftmost position with
respect to the position of octets in the protocol message".

> >          If the act of prepending will cause an overflow in the AS_PATH
> >          segment, i.e. more than 255 ASs, it is legal to prepend a new
> >          segment of type AS_SEQUENCE and prepend its own AS number to
> >          this new segment.
> 
> What's the recommended behavior here?

"it is legal to prepend" really means "it SHOULD prepend". 
In the absence of any objections by June 10 I'll update the text.
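
The prepend rule, including the overflow case, could be sketched as
follows. This is an illustration under stated assumptions, not text
from the draft: the AS_PATH is modeled as a list of
(segment type, AS list) tuples with index 0 being the first segment in
the message, and the handling of an empty path or a leading AS_SET
(start a new AS_SEQUENCE) follows the remaining cases of 5.1.2 that are
not quoted above.

```python
AS_SET = 1
AS_SEQUENCE = 2
MAX_SEGMENT_LEN = 255  # the segment length is carried in one octet


def prepend_own_as(as_path, local_as):
    """Return a new AS_PATH with local_as in the leftmost position.

    as_path is a list of (segment_type, [as_numbers]) tuples; the input
    list is not modified.
    """
    if as_path:
        seg_type, members = as_path[0]
        if seg_type == AS_SEQUENCE and len(members) < MAX_SEGMENT_LEN:
            # Room left: extend the existing leading AS_SEQUENCE.
            return [(AS_SEQUENCE, [local_as] + members)] + as_path[1:]
    # Empty path, leading AS_SET, or a full leading AS_SEQUENCE:
    # start a new AS_SEQUENCE segment holding only the local AS.
    return [(AS_SEQUENCE, [local_as])] + list(as_path)
```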

> 
> 
> > 5.1.4 MULTI_EXIT_DISC
> > 
> > 
> >    The MULTI_EXIT_DISC is an optional non-transitive attribute which is
> >    intended to be used on external (inter-AS) links to discriminate
> >    among multiple exit or entry points to the same neighboring AS.  The
> >    value of the MULTI_EXIT_DISC attribute is a four octet unsigned num-
> >    ber which is called a metric. All other factors being equal, the exit
> >    point with lower metric SHOULD be preferred. If received over EBGP,
> >    the MULTI_EXIT_DISC attribute MAY be propagated over IBGP to other
> >    BGP speakers within the same AS. The MULTI_EXIT_DISC attribute
> 
> seems that a reference to 9.1.2.2 is due here, as using MED in local
> route calculation and not propagating it further is dangerous

Sure.

> >    received from a neighboring AS MUST NOT be propagated to other neigh-
> >    boring ASs.
> > 
> >    A BGP speaker MUST IMPLEMENT a mechanism based on local configuration
>                         ^^^^^^^^^lower-case

Sure.

> >    which allows the MULTI_EXIT_DISC attribute to be removed from a
> >    route. This MAY be done prior to determining the degree of preference
> 
> what's the recommended behavior here?

What the text is saying is that a BGP speaker may optionally (MAY)
remove MED from a route. If the speaker does this, then this *has
to* happen prior to determining the degree of preference for the
route. So, what "This MAY" refers to is the fact that removing MED
is optional. To clarify I would replace "This MAY be done" with
"Removal of the MULTI_EXIT_DISC attribute MAY be done".

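The ordering constraint (removal is optional, but if done it precedes
the preference calculation) can be sketched as below. The route
representation as a dict, the `strip_med` configuration flag and the
`degree_of_preference` callable are all hypothetical names used only
for illustration.

```python
def phase1(route, strip_med, degree_of_preference):
    """Apply optional MED removal, then compute the degree of preference.

    route: dict of attribute name -> value (hypothetical representation).
    strip_med: local configuration flag; the removal itself is optional.
    degree_of_preference: hypothetical policy callable taking the route.
    """
    candidate = dict(route)  # leave the caller's copy untouched
    if strip_med:
        # If removal happens at all, it happens before the preference
        # is computed, so the policy never sees the received MED.
        candidate.pop("MULTI_EXIT_DISC", None)
    return candidate, degree_of_preference(candidate)
```
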
> >    of the route and performing route selection (decision process phases
> >    1 and 2).
> > 
> >    An implementation MAY also (based on local configuration) alter the
> >    value of the MULTI_EXIT_DISC attribute received over EBGP.  This MAY
> >    be done prior to determining the degree of preference of the route
> 
> what's the recommended behavior here?

The same as the previous comment.

> > 5.1.5 LOCAL_PREF
> ...
> > A BGP speaker SHALL calculate the degree of preference for
> >    each external route based on the locally configured policy, and
> 
> Should we be more honest here and say that the implementation must
> allow the admin to SET the degree of preference through the local
> policy to influence the best-path selection process, i.e., I don't
> think any implementation really *calculates* it.

Please see my answer to your comment on 9.1.1.

> > 5.1.6 ATOMIC_AGGREGATE
> ...
> >    A BGP speaker that receives a route with the ATOMIC_AGGREGATE
> >    attribute MUST NOT make any NLRI of that route more specific (as
> >    defined in 9.1.4) when advertising this route to other BGP speakers.
> 
> Since deaggregation is not described in this document, do we need this
> para?

I would prefer to keep the current text, so as to make sure that an
implementation wouldn't do deaggregation.

> >   A BGP speaker that receives a route with the ATOMIC_AGGREGATE
> >    attribute needs to be cognizant of the fact that the actual path to
> >    destinations, as specified in the NLRI of the route, while having the
> >    loop-free property, may not be the path specified in the AS_PATH
> >    attribute of the route.
> 
> What does this really mean from the implementation perspective?

This is mostly FYI. It has to do with the user of BGP...

> > 5.1.7 AGGREGATOR
> > 
> > 
> >    AGGREGATOR is an optional transitive attribute which MAY be included
> >    in updates which are formed by aggregation (see Section 9.2.2.2). A
> >    BGP speaker which performs route aggregation MAY add the AGGREGATOR
> 
> What's the recommended behavior here? Include or not, and under what
> circumstances?

The spec doesn't provide any recommendation on this, as it is optional.

> > 6. BGP Error Handling.
> ...
> >    The phrase "the BGP connection is closed" means that the TCP connec-
> >    tion has been closed, the associated Adj-RIB-In has been cleared, and
> >    that all resources for that BGP connection have been deallocated.
> >    Entries in the Loc-RIB associated with the remote peer are marked as
> >    invalid. The fact that the routes have become invalid is passed to
> >    other BGP peers before the routes are deleted from the system.
> 
> What does "the fact is passed" mean? Should we instead say that local
> route recalculation happens and peers are sent either updated best
> routes or withdrawals?

How about the following replacement for the last sentence:

   The local system recalculates its best routes for the destinations
   of the routes marked as invalid, and advertises to its peers either
   withdraws for the routes marked as invalid, or the new best routes
   before the invalid routes are deleted from the system.
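
A rough sketch of the proposed behavior, assuming simple dict-based
RIBs. The data layout, the `prefer` key function and the `send`
callback are hypothetical; the point is only the order of operations:
recalculate, tell the peers, and only then delete the invalid routes.

```python
def on_peer_closed(peer, adj_ribs_in, loc_rib, prefer, send):
    """Handle the closing of the BGP connection to `peer`.

    adj_ribs_in: {peer: {prefix: route}}
    loc_rib:     {prefix: (peer, route)}
    prefer:      hypothetical key function; higher means more preferred
    send:        hypothetical callback that queues one update for peers
    """
    adj_ribs_in.pop(peer, None)  # clear the associated Adj-RIB-In
    invalid = [p for p, (src, _) in loc_rib.items() if src == peer]

    # Recalculate the best route for every affected destination.
    replacements = {}
    for prefix in invalid:
        candidates = [(src, rib[prefix])
                      for src, rib in adj_ribs_in.items() if prefix in rib]
        if candidates:
            replacements[prefix] = max(candidates, key=lambda c: prefer(c[1]))

    # Advertise new best routes or withdraws first ...
    for prefix in invalid:
        if prefix in replacements:
            send(("advertise", prefix, replacements[prefix][1]))
        else:
            send(("withdraw", prefix))

    # ... and only then remove the invalid routes from the Loc-RIB.
    for prefix in invalid:
        if prefix in replacements:
            loc_rib[prefix] = replacements[prefix]
        else:
            del loc_rib[prefix]
```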

> > 6.2 OPEN message error handling.
> ...
> >    If the Autonomous System field of the OPEN message is unacceptable,
> >    then the Error Subcode is set to Bad Peer AS. The determination of
> >    acceptable Autonomous System numbers is outside the scope of this
> >    protocol.
> 
> Shouldn't we say that configuration based detection should be
> supported, i.e., when remote-as is configured for the peer?

No.

> ...
> >   If the BGP Identifier field of the OPEN message is syntactically
> >    incorrect, then the Error Subcode is set to Bad BGP Identifier.  Syn-
> >    tactic correctness means that the BGP Identifier field represents a
> >    valid IP host address.
> 
> Is "valid IP host address" defined somewhere, btw?

Certainly not in this document. Perhaps for clarity I'll add
"unicast" in front of "IP host address".

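Since the draft does not define the term, the following is only one
possible reading of "valid unicast IP host address": reject the
unspecified address, multicast addresses and the reserved 240.0.0.0/4
block (which includes the limited broadcast address). The function
name and the choice of exclusions are assumptions.

```python
import ipaddress


def bgp_identifier_looks_valid(identifier):
    """Check a BGP Identifier given as a 4-octet unsigned integer.

    Returns False for 0.0.0.0, multicast (224.0.0.0/4) and reserved
    (240.0.0.0/4) addresses; everything else is accepted.
    """
    addr = ipaddress.IPv4Address(identifier)
    return not (addr.is_unspecified or addr.is_multicast or addr.is_reserved)
```
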
> > 6.3 UPDATE message error handling.
> > 
> > 
> >    All errors detected while processing the UPDATE message are indicated
> >    by sending the NOTIFICATION message with Error Code UPDATE Message
> >    Error. The error subcode elaborates on the specific nature of the
> >    error.
> 
> "are indicated..." is this a MUST, SHOULD, or MAY?

MUST.

> ...
> >    If the ORIGIN attribute has an undefined value, then the Error Sub-
> >    code is set to Invalid Origin Attribute. The Data field contains the
> >    unrecognized attribute (type, length and value).
> 
> Curious: do we really have to drop a session on this condition? Given
> that the attribute was syntactically correct and the TLV was not
> broken, so the stream is still in sync and we can move on? Of course,
> if this is what current implementations do, we have no other choice.

In the current spec all the errors are fatal. Including errors
in the ORIGIN attribute.
  
> ...
> >    If the UPDATE message is received from an external peer, the local
> >    system MAY check whether the leftmost AS in the AS_PATH attribute is
> 
> Same comment about 'leftmost'... Maybe we should define this somewhere
> in the beginning of the spec?

I will replace "the leftmost AS" with "the leftmost AS with
respect to the position of octets in the protocol message".
  
> ...
> >    The NLRI field in the UPDATE message is checked for syntactic valid-
> >    ity. If the field is syntactically incorrect, then the Error Subcode
> >    is set to Invalid Network Field.
> 
> Should we give more data on what syntactic validity means in this case
> so people behave consistently?

As Curtis suggested a while ago:

     If the document is unclear to the well qualified reader (one
     possessing a thorough understanding of foundations of this work,
     including IP routing, TCP, TCP programming, and the referenced
     documents) then the document may need to be changed to improve
     clarity.

The case you mentioned above suggests that the reader is not
well qualified.

> > 6.7 Cease.
> ...
> > If the BGP speaker decides to terminate its BGP
> >    connection with a neighbor because the number of address prefixes
> >    received from the neighbor exceeds the locally configured upper
> >    bound, then the speaker MUST send to the neighbor a NOTIFICATION mes-
> >    sage with the Error Code Cease.
> 
> Should we also say that when the peer decides to discard incoming
> prefixes, this event should be logged locally?

In the absence of any objections by June 10 I'll add the following to 
the text:

    The speaker MAY also log this locally.

> > 9. UPDATE Message Handling
> > 
> > 
> >    An UPDATE message may be received only in the Established state.
>
> What if it is received in another state?

It is an error. To make this clear I'll add the following to the
text:

   Receiving an UPDATE message in any other state is an error.
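
As an illustration, a state check of this kind might look like the
sketch below. The state names follow the BGP finite state machine; the
exception class and function name are hypothetical, and which
NOTIFICATION to send is deliberately left out because it is not stated
here.

```python
from enum import Enum


class State(Enum):
    IDLE = "Idle"
    CONNECT = "Connect"
    ACTIVE = "Active"
    OPEN_SENT = "OpenSent"
    OPEN_CONFIRM = "OpenConfirm"
    ESTABLISHED = "Established"


class UnexpectedUpdate(Exception):
    """Hypothetical error raised when an UPDATE arrives too early."""


def accept_update(state):
    """Return True in Established; raise UnexpectedUpdate otherwise."""
    if state is not State.ESTABLISHED:
        raise UnexpectedUpdate(f"UPDATE received in state {state.value}")
    return True
```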

> ...
> > 9.1 Decision Process
> > 
> > 
> >    The Decision Process selects routes for subsequent advertisement by
> >    applying the policies in the local Policy Information Base (PIB) to
> >    the routes stored in its Adj-RIBs-In. The output of the Decision Pro-
> >    cess is the set of routes that will be advertised to peers; the
> >    selected routes will be stored in the local speaker's Adj-RIB-Out
> RIB-Out or RIBs-out (plural)?

Plural.

> >    according to policy.
> > 
> >    The selection process is formalized by defining a function that takes
> >    the attribute of a given route as an argument and returns either (a)
> >    a non-negative integer denoting the degree of preference for the
> >    route, or (b) a value denoting that this route is ineligible to be
> >    installed in LocRib and will be excluded from the next phase of route
> 
> Loc-RIB

Ok.

> >    selection.
> ...
> >    The Decision Process operates on routes contained in the Adj-RIB-In,
> Adj-RIBs-In (plural) ?

Plural.

> >    and is responsible for:
> 
> > 9.1.1 Phase 1: Calculation of Degree of Preference
> ...
> >       If the route is learned from an external peer, then the local BGP
> >       speaker computes the degree of preference based on preconfigured
> >       policy information. If the return value indicates that the route
> >       is ineligible, the route MAY NOT serve as an input to the next
> >       phase of route selection; otherwise the return value is used as
> >       the LOCAL_PREF value in any IBGP readvertisement.
> 
> So, AFAIK, the major implementations do not follow this step
> (calculating the degree of preference, and then announcing). Instead,
> implementations allow setting the LOCAL_PREF value locally, which is
> taken into consideration during the best path selection, and is also
> reannounced further.

It is important to keep in mind that the whole section on the BGP
decision process does *not* mean that an implementation must implement
it precisely as it is described in the spec, as long as the implementation
supports the described functionality and its externally visible behavior
is the same. With this in mind, how about if I add the following:

   The BGP Decision Process in this document is conceptual and does
   not have to be implemented precisely as described here, as long
   as the implementations support the described functionality and
   their externally visible behavior is the same.

> Also "is used" is not specific enough. Is it SHOULD or MUST?

MUST.

> > 9.1.2 Phase 2: Route Selection
> ...
> >    If the AS_PATH attribute of a BGP route contains an AS loop, the BGP
> >    route should be excluded from the Phase 2 decision function.  AS loop
> >    detection is done by scanning the full AS path (as specified in the
> >    AS_PATH attribute), and checking that the autonomous system number of
> >    the local system does not appear in the AS path.  Operations of a BGP
> >    speaker that is configured to accept routes with its own autonomous
> >    system number in the AS path are outside the scope of this document.
> 
> If we're checking for an AS loop here (in Phase 2) as opposed to
> during the UPDATE message sanity checking, the route is already
> received and accepted in the peer's Adj-RIB-In. Those implementations
> I know don't even install such routes in the RIB...

This is the text that the WG agreed on (see e-mail from John Scudder on
Mon, 02 Dec 2002 10:54:45 EST). Also, see my response to your previous
comment.
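
The AS loop check quoted above amounts to scanning every segment of the
AS_PATH for the local AS number. A minimal sketch, assuming the path is
held as a list of (segment type, AS list) tuples and routes as dicts
(both hypothetical representations):

```python
def has_as_loop(as_path, local_as):
    """Return True if local_as appears anywhere in the AS path.

    as_path: list of (segment_type, [as_numbers]) tuples; both
    AS_SEQUENCE and AS_SET segments are scanned.
    """
    return any(local_as in members for _, members in as_path)


def phase2_candidates(routes, local_as):
    """Keep only the routes whose AS_PATH does not contain local_as."""
    return [r for r in routes if not has_as_loop(r["AS_PATH"], local_as)]
```

Whether the filtering runs during UPDATE sanity checking or at the
start of Phase 2 does not change the externally visible result, which
is the point made above.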

> > 9.1.2.2 Breaking Ties (Phase 2)
> ...
> >       Similarly, neighborAS(n) is a function which returns the neighbor
> >       AS from which the route was received.  If the route is learned via
> >       IBGP, and the other IBGP speaker didn't originate the route, it is
> >       the neighbor AS from which the other IBGP speaker learned the
> >       route. If the route is learned via IBGP, and the other IBGP
> >       speaker originated the route, it is the local AS.
> 
> What if the route is locally originated?

Breaking ties has to do with the routes received from other BGP speakers,
not with the routes locally originated.
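
For received routes, one common way to realize neighborAS(n) is to read
it off the AS_PATH: the leftmost AS if the path is non-empty, otherwise
the local AS (an IBGP peer originated the route). This is an assumption
about implementation, not something the draft prescribes, and the flat
list of AS numbers is a simplification that ignores segment types.

```python
def neighbor_as(as_path, local_as):
    """Return the neighbor AS for a route received from a BGP peer.

    as_path: flat list of AS numbers, leftmost first (a simplification).
    An empty path means the route was originated inside the local AS.
    """
    return as_path[0] if as_path else local_as
```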

> > 9.1.4 Overlapping Routes
> ...
> >    When overlapping routes are present in the same Adj-RIB-In, the more
> >    specific route takes precedence, in order from more specific to least
> >    specific.
> > 
> Doesn't this happen at the packet forwarding stage?

Yes, it does. But only if both routes are present in the FIB.
I also think that this sentence isn't needed, so in the absence
of any objections by June 10 I propose to remove it.

> >    The set of destinations described by the overlap represents a portion
> >    of the less specific route that is feasible, but is not currently in
> >    use.  If a more specific route is later withdrawn, the set of desti-
> >    nations described by the overlap will still be reachable using the
> >    less specific route.
> > 
> >    If a BGP speaker receives overlapping routes, the Decision Process
> >    MUST consider both routes based on the configured acceptance policy.
> >    If both a less and a more specific route are accepted, then the Deci-
> >    sion Process MUST either install both the less and the more specific
>   
> Install where?

In Loc-RIB. I'll insert "in Loc-RIB" to make this clear.

> >    routes or it MUST aggregate the two routes and install the aggregated
> >    route, provided that both routes have the same value of the NEXT_HOP
> >    attribute.
> 
> anyone really does the latter?

I will find this out from the implementation report.

> >    If a BGP speaker chooses to aggregate, then it SHOULD either include
> >    all AS used to form the aggreagate in an AS_SET or add the
> >    ATOMIC_AGGREGATE attribute to the route.  This attribute is now pri-
> >    marily informational.  With the elimination of IP routing protocols
> >    that do not support classless routing and the elimination of router
> >    and host implementations that do not support classless routing, there
> >    is no longer a need to deaggregate.  Routes SHOULD NOT be de-aggre-
> >    gated.  A route that carries ATOMIC_AGGREGATE attribute in particular
> >    MUST NOT be de-aggregated. That is, the NLRI of this route can not be
> >    made more specific. Forwarding along such a route does not guarantee
> >    that IP packets will actually traverse only ASs listed in the AS_PATH
> >    attribute of the route.
> 
> Since we don't do deaggregation any more, should we remove the
> discussion about it completely and indicate in the "changes" section
> that deaggregation has been deprecated?

As I said before, I would prefer to keep the text on de-aggregation in.

> > 9.2 Update-Send Process
> ...
> >    When a BGP speaker receives an UPDATE message from an internal peer,
> >    the receiving BGP speaker SHALL NOT re-distribute the routing infor-
> >    mation contained in that UPDATE message to other internal peers,
> >    unless the speaker acts as a BGP Route Reflector [RFC2796].
> 
> Suggest to put "unless..." in brackets () to make it more apparent
> that this is not a normative ref.

Ok.

> > 9.2.1.1 Frequency of Route Advertisement
> >    Since fast convergence is needed within an autonomous system, either
> >    (a) the MinRouteAdvertisementInterval used for internal peers SHOULD
> >    be shorter than the MinRouteAdvertisementInterval used for external
> >    peers, or (b) the procedure describe in this section SHOULD NOT apply
> >    for routes sent to internal peers.
> 
> It sounded like MinRouteAdvertisementInterval was an architectural
> constant, but now it sounds like either this is a timer that can be
> assigned different settings or there are two constants:
> MinRouteAdvIntIBGP and MinRouteAdvIntEBGP.

There is a timer (MinRouteAdvertisementInterval) that can be assigned 
different settings.
  
> > 9.2.2.2 Aggregating Routing Information
> > 
> 
> Hmmm... I expected to see in this section some text talking about when
> and how an aggregate would be announced, i.e., when an aggregate
> prefix is configured, and more specific routes are present, the
> aggregate is announced, when no specifics are left--withdraw the
> aggregate. I haven't found anything on this topic...

That is outside the scope of the *protocol* spec. See RFC 1519 for
more on this.

> > 9.3 Route Selection Criteria
> >
> >    Generally speaking, additional rules for comparing routes among sev-
> >    eral alternatives are outside the scope of this document. There are
> >    two exceptions:
> > 
> >       - If the local AS appears in the AS path of the new route being
> >       considered, then that new route can not be viewed as better than
> >       any other route (provided that the speaker is configured to accept
> >       such routes). If such a route were ever used, a routing loop could
> >       result.
> > 
> >       - In order to achieve successful distributed operation, only
> >       routes with a likelihood of stability can be chosen. Thus, an AS
> >       SHOULD avoid using unstable routes, and it SHOULD NOT make rapid
> >       spontaneous changes to its choice of route. Quantifying the terms
> >       "unstable" and "rapid" in the previous sentence will require expe-
> >       rience, but the principle is clear.
> 
> Where does this (the second one) fit within and how does this affect
> the route selection criteria?

Routes that flap often can be "penalized" (e.g., route dampening).
I'll add a pointer to the route dampening spec here.
  
> >    Care must be taken to ensure that BGP speakers in the same AS do not
> >    make inconsistent decisions.
> 
> How? 

By means outside of the protocol. How about if I just remove this
sentence?

> What does this mean for the implementor?
>
> > 9.4 Originating BGP routes
> > 
> >    A BGP speaker may originate BGP routes by injecting routing informa-
> >    tion acquired by some other means (e.g. via an IGP) into BGP. A BGP
> >    speaker that originates BGP routes assigns the degree of preference
> > 
> 
> "assigns the degree of preference"... how do the implementations
> really do that?

E.g., via CLI. I'll add "(e.g., via CLI)" after "assigns the degree
of preference".
  
> > 10 BGP Timers
> ...
> >    The suggested default value for the MinRouteAdvertisementInterval is
> >    30 seconds.
> 
> This was described as a parameter, not a timer. Further, it was
> earlier suggested that it should be shorter for iBGP than it is for
> eBGP. I'd expect the document to specify the recommended value for
> both.

This is for eBGP. For iBGP the suggested value is 5 secs (I'll add this
to the draft).
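
As a sketch of how the two suggested values might be applied, the
snippet below paces advertisements per prefix using 30 seconds for
eBGP peers and 5 seconds for iBGP peers. The class and method names
are hypothetical, and tracking the interval per prefix is an
assumption made for the example rather than something the text above
requires.

```python
# Suggested values from the discussion: 30 s for eBGP, 5 s for iBGP.
MRAI_SECONDS = {"ebgp": 30.0, "ibgp": 5.0}


class AdvertisementPacer:
    """Hypothetical helper: rate-limits advertisements to one peer."""

    def __init__(self, peer_type):
        self.interval = MRAI_SECONDS[peer_type]
        self.last_sent = {}  # prefix -> time of last advertisement

    def may_advertise(self, prefix, now):
        # Refuse if the previous advertisement for this prefix was sent
        # less than MinRouteAdvertisementInterval seconds ago.
        last = self.last_sent.get(prefix)
        if last is not None and now - last < self.interval:
            return False
        self.last_sent[prefix] = now
        return True
```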

> > IANA Considerations
> ...
> >    All extensions to this protocol, including new message types and Path
> >    Attributes MUST only be made using the Standards Action process
> >    defined in [RFC2434].
> 
> This section should include the description of each registry that
> needs to be created (if needed) and maintained by IANA, as well as the
> allocation policy that is in the text already.

Sure.

Yakov.


Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id QAA13684 for <idr-archive@nic.merit.edu>; Fri, 23 May 2003 16:52:03 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id 8912791250; Fri, 23 May 2003 16:51:35 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id 5CBDD91251; Fri, 23 May 2003 16:51:35 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id 6106F91250 for <idr@trapdoor.merit.edu>; Fri, 23 May 2003 16:51:34 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id 3F3995DE37; Fri, 23 May 2003 16:51:34 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from dog.tcb.net (dog.tcb.net [64.78.150.133]) by segue.merit.edu (Postfix) with ESMTP id 1B30D5DE29 for <idr@merit.edu>; Fri, 23 May 2003 16:51:34 -0400 (EDT)
Received: from [192.168.1.39] (vdsl-151-118-3-177.dnvr.uswest.net [151.118.3.177]) by dog.tcb.net (Postfix) with ESMTP id 722DC2029E for <idr@merit.edu>; Fri, 23 May 2003 14:56:55 -0600 (MDT)
User-Agent: Microsoft-Entourage/10.0.0.1309
Date: Fri, 23 May 2003 14:51:16 -0600
Subject: Re: EBGP - Setting Nexthop
From: Danny McPherson <danny@tcb.net>
To: <idr@merit.edu>
Message-ID: <BAF3E5E4.6501%danny@tcb.net>
In-Reply-To: <006c01c32168$9e56a620$cbc8c8c8@sdksoft.com>
Mime-version: 1.0
Content-type: text/plain; charset="US-ASCII"
Content-transfer-encoding: 7bit
Sender: owner-idr@merit.edu
Precedence: bulk

On 5/23/03 2:19 PM, "Parag Deshpande" <paragdeshpande@sdksoft.com> wrote:

> Thanks Danny,
> 
> Then what I get is that it really doesn't matter what the default
> behavior is since knobs (policies) can be used to modify NEXT_HOP
> as needed. (?)

Correct.  Both on the transmit and receive (e.g., enforce next-hop)
sides.

-danny




Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id QAA13514 for <idr-archive@nic.merit.edu>; Fri, 23 May 2003 16:19:24 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id 751749124B; Fri, 23 May 2003 16:19:02 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id 4AD9F9124D; Fri, 23 May 2003 16:19:02 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id CD7829124B for <idr@trapdoor.merit.edu>; Fri, 23 May 2003 16:19:00 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id A74965DDF4; Fri, 23 May 2003 16:19:00 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from mpls-qmqp-01.inet.qwest.net (mpls-qmqp-01.inet.qwest.net [63.231.195.112]) by segue.merit.edu (Postfix) with SMTP id 543CA5DE03 for <idr@merit.edu>; Fri, 23 May 2003 16:19:00 -0400 (EDT)
Received: (qmail 34830 invoked by uid 0); 23 May 2003 20:19:00 -0000
Received: from unknown (63.231.195.5) by mpls-qmqp-01.inet.qwest.net with QMQP; 23 May 2003 20:19:00 -0000
Received: from 0-1pool172-208.nas17.minneapolis1.mn.us.da.qwest.net (HELO charita) (67.4.172.208) by mpls-pop-05.inet.qwest.net with SMTP; 23 May 2003 20:18:59 -0000
Date: Fri, 23 May 2003 15:19:30 -0500
Message-ID: <006c01c32168$9e56a620$cbc8c8c8@sdksoft.com>
From: "Parag Deshpande" <paragdeshpande@sdksoft.com>
To: "Danny McPherson" <danny@tcb.net>, idr@merit.edu
References: <BAF3CC9F.64AB%danny@tcb.net>
Subject: Re: EBGP - Setting Nexthop
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.00.2615.200
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2615.200
Sender: owner-idr@merit.edu
Precedence: bulk

Thanks Danny,

Then what I get is that it really doesn't matter what the default behavior
is, since knobs (policies) can be used to modify NEXT_HOP as needed. (?)

Parag

> > Hi,
> >
> > I have a doubt regarding setting of nexthop value in the following
> > scenario:
> >
> > Router R has 2 ebgp peers A and B, all on same subnet S1.
> > A - Sends a prefix to R with nexthop N1 where N1 = A.
> > R - Installs and then forwards the prefix to B with N1 = ?
> >
> > In this case should R set N1 = A or N1 = R (on S1).
>
> It's a matter of policy, really.  For instance, perhaps A and B don't peer
> directly and A doesn't want to accept packets directly from B, so setting a
> third party NEXT_HOP may break things (e.g., Link Layer filtering is
> implemented or no Layer 2 connection exists directly between A and B, even
> though A, B & R share a common subnet) or violate some policy (In a previous
> job we peered with a network at a multi-access exchange point -- purely out
> of goodwill.  They began sending us lots of traffic and after some
> investigation we realized they were selling transit across the local
> exchange point via readvertising our routes to their transit customers and
> preserving the NEXT_HOP, such that their customers were sending traffic
> directly to us -- they never touched the outbound traffic!  Needless to say,
> MAC-Layer filtering was deployed shortly thereafter).
>
> On the other hand, perhaps they're all in agreement that this is a fine
> thing in order to optimize the forwarding path AND connectivity exists such
> that B can send traffic directly to A -- so it makes sense for R to preserve
> the NEXT_HOP.
>
> > I saw a major vendor setting it to R. Is that a preferred practice?
> > If yes why?
>
> Again, it's all a matter of policy, and all the "major vendors" I'm familiar
> with provide the knobs to set it pretty much however you prefer, though I
> have seen some variances in default behaviors.
>
> -danny
>
>
>



Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id PAA13100 for <idr-archive@nic.merit.edu>; Fri, 23 May 2003 15:04:13 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id AF4FB91249; Fri, 23 May 2003 15:03:46 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id 80F359124A; Fri, 23 May 2003 15:03:46 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id 6EE8291249 for <idr@trapdoor.merit.edu>; Fri, 23 May 2003 15:03:45 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id 4D3A95DD91; Fri, 23 May 2003 15:03:45 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from dog.tcb.net (dog.tcb.net [64.78.150.133]) by segue.merit.edu (Postfix) with ESMTP id 280595DD8D for <idr@merit.edu>; Fri, 23 May 2003 15:03:45 -0400 (EDT)
Received: from [192.168.1.39] (vdsl-151-118-3-177.dnvr.uswest.net [151.118.3.177]) by dog.tcb.net (Postfix) with ESMTP id 29964202A0 for <idr@merit.edu>; Fri, 23 May 2003 13:09:06 -0600 (MDT)
User-Agent: Microsoft-Entourage/10.0.0.1309
Date: Fri, 23 May 2003 13:03:27 -0600
Subject: Re: EBGP - Setting Nexthop
From: Danny McPherson <danny@tcb.net>
To: <idr@merit.edu>
Message-ID: <BAF3CC9F.64AB%danny@tcb.net>
In-Reply-To: <001501c3214a$c0613b40$cbc8c8c8@sdksoft.com>
Mime-version: 1.0
Content-type: text/plain; charset="US-ASCII"
Content-transfer-encoding: 7bit
Sender: owner-idr@merit.edu
Precedence: bulk

On 5/23/03 10:45 AM, "Parag Deshpande" <paragdeshpande@sdksoft.com> wrote:

> Hi,
> 
> I have a doubt regarding setting of nexthop value in the following scenario:
> 
> Router R has 2 ebgp peers A and B, all on same subnet S1.
> A - Sends a prefix to R with nexthop N1 where N1 = A.
> R - Installs and then forwards the prefix to B with N1 = ?
> 
> In this case should R set N1 = A or N1 = R (on S1).

It's a matter of policy, really.  For instance, perhaps A and B don't peer
directly and A doesn't want to accept packets directly from B, so setting a
third party NEXT_HOP may break things (e.g., Link Layer filtering is
implemented or no Layer 2 connection exists directly between A and B, even
though A, B & R share a common subnet) or violate some policy (In a previous
job we peered with a network at a multi-access exchange point -- purely out
of goodwill.  They began sending us lots of traffic and after some
investigation we realized they were selling transit across the local
exchange point via readvertising our routes to their transit customers and
preserving the NEXT_HOP, such that their customers were sending traffic
directly to us -- they never touched the outbound traffic!  Needless to say,
MAC-Layer filtering was deployed shortly thereafter).

On the other hand, perhaps they're all in agreement that this is a fine
thing in order to optimize the forwarding path AND connectivity exists such
that B can send traffic directly to A -- so it makes sense for R to preserve
the NEXT_HOP.  

> I saw a major vendor setting it to R. Is that a preferred practice?
> If yes why?

Again, it's all a matter of policy, and all the "major vendors" I'm familiar
with provide the knobs to set it pretty much however you prefer, though I
have seen some variances in default behaviors.
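
To illustrate the "it's policy" point, here is a small sketch of the
choice R faces when readvertising A's route to B. It is not any
vendor's behavior: the function name, the preserve_third_party knob
and the example addresses are all made up for illustration.

```python
import ipaddress


def choose_next_hop(received_next_hop, local_address, peer_address,
                    shared_subnet, preserve_third_party):
    """Pick the NEXT_HOP that R would send to an external peer.

    preserve_third_party is a hypothetical policy knob: when set, and
    when both the received NEXT_HOP and the peer sit on the shared
    subnet, the received NEXT_HOP is kept (N1 = A). Otherwise R uses
    its own interface address (N1 = R).
    """
    subnet = ipaddress.ip_network(shared_subnet)
    received = ipaddress.ip_address(received_next_hop)
    peer = ipaddress.ip_address(peer_address)
    if preserve_third_party and received in subnet and peer in subnet:
        return str(received)  # "third party" NEXT_HOP
    return str(ipaddress.ip_address(local_address))  # "first party" NEXT_HOP


# Example addresses (documentation range): A = .1, R = .2, B = .3 on S1.
print(choose_next_hop("192.0.2.1", "192.0.2.2", "192.0.2.3",
                      "192.0.2.0/24", preserve_third_party=True))
print(choose_next_hop("192.0.2.1", "192.0.2.2", "192.0.2.3",
                      "192.0.2.0/24", preserve_third_party=False))
```

Note that the subnet check only covers IP-level adjacency; it says
nothing about whether Layer 2 connectivity or filtering actually lets
B reach A, which is exactly why this ends up being a policy decision.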

-danny




Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id MAA12196 for <idr-archive@nic.merit.edu>; Fri, 23 May 2003 12:46:20 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id 755669123D; Fri, 23 May 2003 12:45:17 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id 3CFB591244; Fri, 23 May 2003 12:45:17 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id 7A7C59123D for <idr@trapdoor.merit.edu>; Fri, 23 May 2003 12:45:15 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id 3CF325E77A; Fri, 23 May 2003 12:45:14 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from mpls-qmqp-01.inet.qwest.net (mpls-qmqp-01.inet.qwest.net [63.231.195.112]) by segue.merit.edu (Postfix) with SMTP id 5A1465E082 for <idr@merit.edu>; Fri, 23 May 2003 12:45:12 -0400 (EDT)
Received: (qmail 19220 invoked by uid 0); 23 May 2003 16:45:12 -0000
Received: from unknown (63.231.195.13) by mpls-qmqp-01.inet.qwest.net with QMQP; 23 May 2003 16:45:12 -0000
Received: from 0-1pool172-208.nas17.minneapolis1.mn.us.da.qwest.net (HELO charita) (67.4.172.208) by mpls-pop-13.inet.qwest.net with SMTP; 23 May 2003 16:45:12 -0000
Date: Fri, 23 May 2003 11:45:38 -0500
Message-ID: <001501c3214a$c0613b40$cbc8c8c8@sdksoft.com>
From: "Parag Deshpande" <paragdeshpande@sdksoft.com>
To: idr@merit.edu
References: <20030520145557.G16646@nexthop.com>
Subject: EBGP - Setting Nexthop
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 5.00.2615.200
X-MimeOLE: Produced By Microsoft MimeOLE V5.00.2615.200
Sender: owner-idr@merit.edu
Precedence: bulk

Hi,

I have a doubt regarding setting of nexthop value in the following scenario:

Router R has 2 ebgp peers A and B, all on same subnet S1.
A - Sends a prefix to R with nexthop N1 where N1 = A.
R - Installs and then forwards the prefix to B with N1 = ?

In this case should R set N1 = A or N1 = R (on S1).

I saw a major vendor setting it to R. Is that a preferred practice?
If yes why?

Reference:
5.1.3 NEXT_HOP
      2) When sending a message to an external peer X, and the peer is
      one IP hop away from the speaker:
      ........
       - Otherwise, if the route being announced was learned from an
         external peer, the speaker can use in the NEXT_HOP attribute an
         IP address of any adjacent router (known from the received
         NEXT_HOP attribute) that the speaker itself uses for local
         route calculation, provided that peer X shares a common subnet
         with this address. This is a second form of "third party"
         NEXT_HOP attribute.
>> N1 = A
         - Otherwise, if the external peer to which the route is being
         advertised shares a common subnet with one of the interfaces of
         the announcing BGP speaker, the speaker MAY use the IP address
         associated with such an interface in the NEXT_HOP attribute.
         This is known as a "first party" NEXT_HOP attribute.
>> N1 = R
       .....

Thanks
Parag



Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id PAA19873 for <idr-archive@nic.merit.edu>; Wed, 21 May 2003 15:25:48 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id 2F76391235; Wed, 21 May 2003 15:25:29 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id EF42C9123D; Wed, 21 May 2003 15:25:28 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id A853D91235 for <idr@trapdoor.merit.edu>; Wed, 21 May 2003 15:25:27 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id 89C525DFD3; Wed, 21 May 2003 15:25:27 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from ietf.org (odin.ietf.org [132.151.1.176]) by segue.merit.edu (Postfix) with ESMTP id 7EAED5DFD0 for <idr@merit.edu>; Wed, 21 May 2003 15:25:25 -0400 (EDT)
Received: from CNRI.Reston.VA.US (localhost [127.0.0.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id PAA04849; Wed, 21 May 2003 15:25:14 -0400 (EDT)
Message-Id: <200305211925.PAA04849@ietf.org>
To: IETF-Announce: ;
Cc: RFC Editor <rfc-editor@isi.edu>, Internet Architecture Board <iab@iab.org>, idr@merit.edu
From: The IESG <iesg-secretary@ietf.org>
Subject: Document Action: Security Requirements for Keys used with the  TCP MD5 Signature Option to Informational
Date: Wed, 21 May 2003 15:25:14 -0400
Sender: owner-idr@merit.edu
Precedence: bulk

The IESG has approved the Internet-Draft 'Security Requirements for 
Keys used with the TCP MD5 Signature Option' 
<draft-ietf-idr-md5-keys-00.txt> as an Informational RFC.  This 
document is the product of the Inter-Domain Routing Working Group.  The 
IESG contact persons are Bill Fenner and Alex Zinin.
 
 
RFC Editor Note:

Please change the title to "Key Management Considerations for the
TCP MD5 Signature Option".

Please change the following:

In section 3, the first bullet:
OLD:
      o Key lengths SHOULD be between 12 and 24 bytes, with larger keys
        having effectively zero cost when compared to shorter keys.

NEW:
      o Key lengths SHOULD be between 12 and 24 bytes, with larger keys
        having effectively zero additional computational cost when
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        compared to shorter keys.

In section 5, first paragraph:

OLD:
  this option may have lifetimes on the order of months.  It would seem
  prudent, then, to choose a *minimum* key length that guarantees that
  key-guessing runtimes are some reasonable [3-5??] multiple of the
  key-change interval under best-case (for the attacker) practical

NEW:
  this option may have lifetimes on the order of months.  It would seem
  prudent, then, to choose a minimum key length that guarantees that
                              ^^^^^^^ (remove emphasis)
  key-guessing runtimes are some small multiple of the key-change
                                  ^^^^^^^^^^^^^^
  interval under best-case (for the attacker) practical

In section 6, first paragraph:

OLD:
  that the reasonable upper-bound for software-based attack performance
  is 1.0e13 MD5 operations per second, then the *minimum* required key
  entropy is approximately 68 bits.  It is reasonable to round this

NEW:
  that the reasonable upper-bound for software-based attack performance
  is 1.0e13 MD5 operations per second, then the minimum required key
                                                ^^^^^^^ (remove emphasis)
  entropy is approximately 68 bits.  It is reasonable to round this
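
For a rough feel of the scale behind the numbers quoted above, the
sketch below computes how long an exhaustive search of a 68-bit
keyspace would take at 1.0e13 MD5 operations per second. This is only
back-of-the-envelope arithmetic: it assumes one MD5 operation per key
guess and a search of the full keyspace, and it is not the derivation
used in the document itself. The function name is hypothetical.

```python
SECONDS_PER_DAY = 86400.0


def exhaustive_search_days(key_bits, ops_per_second=1.0e13):
    """Days needed to try every key of key_bits bits of entropy.

    Assumes (for illustration) one MD5 operation per guess.
    """
    keyspace = 2 ** key_bits
    return keyspace / ops_per_second / SECONDS_PER_DAY


days_68 = exhaustive_search_days(68)
print(f"68-bit key, full search: about {days_68:.0f} days")
```

Under these assumptions the full search comes out at a little under a
year, which is consistent with key lifetimes "on the order of months".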



Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id OAA09545 for <idr-archive@nic.merit.edu>; Tue, 20 May 2003 14:57:00 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id 85A559126D; Tue, 20 May 2003 14:56:41 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id 5938E9126E; Tue, 20 May 2003 14:56:41 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id 474249126D for <idr@trapdoor.merit.edu>; Tue, 20 May 2003 14:56:40 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id 347275DF33; Tue, 20 May 2003 14:56:40 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from presque.nexthop.com (dns.nexthop.com [65.247.36.216]) by segue.merit.edu (Postfix) with ESMTP id 0255E5DEE3 for <idr@merit.edu>; Tue, 20 May 2003 14:56:39 -0400 (EDT)
Received: (from root@localhost) by presque.nexthop.com (8.12.8/8.11.1) id h4KIu8Q3028978; Tue, 20 May 2003 14:56:08 -0400 (EDT) (envelope-from jhaas@jhaas.nexthop.com)
Received: from jhaas.nexthop.com (jhaas.nexthop.com [65.247.36.31]) by presque.nexthop.com (8.12.8/8.12.8) with ESMTP id h4KIu28o028964; Tue, 20 May 2003 14:56:02 -0400 (EDT) (envelope-from jhaas@jhaas.nexthop.com)
Received: (from jhaas@localhost) by jhaas.nexthop.com (8.11.3nb1/8.11.3) id h4KItv618075; Tue, 20 May 2003 14:55:57 -0400 (EDT)
Date: Tue, 20 May 2003 14:55:57 -0400
From: Jeffrey Haas <jhaas@nexthop.com>
To: idr@merit.edu
Cc: rtg-dir@ietf.org
Subject: [ruwhite@cisco.com: Re: Comments on BGP Draft 20.....]
Message-ID: <20030520145557.G16646@nexthop.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
X-Virus-Scanned: by AMaViS perl-11
Sender: owner-idr@merit.edu
Precedence: bulk

Yakov,

----- Forwarded message from Russ White <ruwhite@cisco.com> -----

Date: Tue, 20 May 2003 14:33:38 -0400 (EDT)
From: Russ White <ruwhite@cisco.com>
To: Jeffrey Haas <jhaas@nexthop.com>
Subject: Re: Comments on BGP Draft 20.....
Reply-To: Russ White <riw@cisco.com>
X-Virus-Scanned: by AMaViS perl-11
X-OriginalArrivalTime: 20 May 2003 18:33:51.0941 (UTC) FILETIME=[5C876750:01C31EFE]


Yeah, this sounds better....

:-)

Russ

On Tue, 20 May 2003, Jeffrey Haas wrote:

> [off-list]
>
> Howzabout:
>    The primary function of a BGP speaking system is to exchange network
>    reachability information with other BGP systems. This network reacha-
>    bility information includes information on the list of Autonomous
>    Systems (ASs) that reachability information traverses.  This informa-
>    tion is sufficient to construct a graph of AS connectivity
> +  for this reachability
>    from which
>    routing loops may be pruned and some policy decisions at the AS level
>    may be enforced.
>
> --
> Jeff Haas
> NextHop Technologies
>

__________________________________
riw@cisco.com CCIE <>< Grace Alone


----- End forwarded message -----

-- 
Jeff Haas 
NextHop Technologies


Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id OAA09318 for <idr-archive@nic.merit.edu>; Tue, 20 May 2003 14:29:46 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id 11EE19126A; Tue, 20 May 2003 14:29:20 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id D5AD99126B; Tue, 20 May 2003 14:29:19 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id C3A119126A for <idr@trapdoor.merit.edu>; Tue, 20 May 2003 14:29:18 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id AC8CC5DF33; Tue, 20 May 2003 14:29:18 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from presque.nexthop.com (dns.nexthop.com [65.247.36.216]) by segue.merit.edu (Postfix) with ESMTP id 809445DF30 for <idr@merit.edu>; Tue, 20 May 2003 14:29:18 -0400 (EDT)
Received: (from root@localhost) by presque.nexthop.com (8.12.8/8.11.1) id h4KISbge028126; Tue, 20 May 2003 14:28:37 -0400 (EDT) (envelope-from jhaas@jhaas.nexthop.com)
Received: from jhaas.nexthop.com (jhaas.nexthop.com [65.247.36.31]) by presque.nexthop.com (8.12.8/8.12.8) with ESMTP id h4KISX8o028114; Tue, 20 May 2003 14:28:33 -0400 (EDT) (envelope-from jhaas@jhaas.nexthop.com)
Received: (from jhaas@localhost) by jhaas.nexthop.com (8.11.3nb1/8.11.3) id h4KISSp17972; Tue, 20 May 2003 14:28:28 -0400 (EDT)
Date: Tue, 20 May 2003 14:28:28 -0400
From: Jeffrey Haas <jhaas@nexthop.com>
To: Yakov Rekhter <yakov@juniper.net>
Cc: Russ White <riw@cisco.com>, idr@merit.edu, rtg-dir@ietf.org
Subject: Re: Comments on BGP Draft 20.....
Message-ID: <20030520142828.E16646@nexthop.com>
References: <Pine.OSX.4.51.0305201307370.23356@dhcp-64-102-48-215.cisco.com> <200305201810.h4KIAbu27841@merlot.juniper.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <200305201810.h4KIAbu27841@merlot.juniper.net>; from yakov@juniper.net on Tue, May 20, 2003 at 11:10:37AM -0700
X-Virus-Scanned: by AMaViS perl-11
Sender: owner-idr@merit.edu
Precedence: bulk

On Tue, May 20, 2003 at 11:10:37AM -0700, Yakov Rekhter wrote:
> > Okay:
> > 
> > The Loc-RIB contains routes which are installed in the local routing table
> > and used for forwarding packets received, based on the destination address,
> > by the router.
> 
> In the absence of any objections within a week I'll put this in the text.

Except:
:   Whether or not the new BGP route replaces an existing
:   non-BGP route in the Routing Table depends on the policy configured
:   on the BGP speaker.

I think the existing text is fine.  The gotcha is one has to read ahead
a bit in the document to find the bit I just quoted.

> > Its value MUST NOT be changed by any other speaker.
> 
> 
> In the absence of any objections within a week I'll put this in the text.

Good grief.

I refer all parties concerned back to the thread titled "Re: issue 32.1",
specifically the consensus mail from Andrew with message-id:
<20020927115144.F13901@demiurge.exodus.net>

Short summary:
1. You shouldn't change it.
2. People *do* change it, and do so for policy reasons.
3. You shouldn't change it, but since people do, we're going to tell
   you that you shouldn't and thus imply that you can if you really
   want to. :-)

My own preference was to *not* change it and MUST would be fine with me,
but consensus was previously reached.

> Yakov.

-- 
Jeff Haas 
NextHop Technologies


Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id OAA09217 for <idr-archive@nic.merit.edu>; Tue, 20 May 2003 14:11:07 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id DABC891268; Tue, 20 May 2003 14:10:46 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id A658191269; Tue, 20 May 2003 14:10:46 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id 5143091268 for <idr@trapdoor.merit.edu>; Tue, 20 May 2003 14:10:45 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id 3D3335DF11; Tue, 20 May 2003 14:10:45 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from merlot.juniper.net (natint.juniper.net [207.17.136.129]) by segue.merit.edu (Postfix) with ESMTP id B3CE75DF0F for <idr@merit.edu>; Tue, 20 May 2003 14:10:44 -0400 (EDT)
Received: from juniper.net (garnet.juniper.net [172.17.28.17]) by merlot.juniper.net (8.11.3/8.11.3) with ESMTP id h4KIAbu27841; Tue, 20 May 2003 11:10:37 -0700 (PDT) (envelope-from yakov@juniper.net)
Message-Id: <200305201810.h4KIAbu27841@merlot.juniper.net>
To: Russ White <riw@cisco.com>
Cc: idr@merit.edu, rtg-dir@ietf.org
Subject: Re: Comments on BGP Draft 20..... 
In-Reply-To: Your message of "Tue, 20 May 2003 13:44:16 EDT." <Pine.OSX.4.51.0305201307370.23356@dhcp-64-102-48-215.cisco.com> 
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-ID: <68386.1053454237.1@juniper.net>
Date: Tue, 20 May 2003 11:10:37 -0700
From: Yakov Rekhter <yakov@juniper.net>
Sender: owner-idr@merit.edu
Precedence: bulk

Russ,

> > > This information is sufficient to construct a graph of the AS connectivity
> > > from which routing loops may be pruned and some policy decisions at the AS
> > > level may be enforced.
> > >
> > > UPDATE Message Format:
> > >
> > > The information in the UPDATE message can be used to construct a graph
> > > describing the relationships of the various Autonomous Systems.
> > >
> > > In both cases this is true, I suppose, but in neither case does this 
> > > really describe what the AS Path is used for, right?
> >
> > Wrong. The abstract states quite clearly that this information is used
> > to prune routing loops and make some policy decisions at the AS level.
> 
> Yes, but it's not used for inscribing a graph of interconnectivity between
> the AS' in the internetwork, is it? It can be used for that, I suppose, so
> the text is fine, but it's not used for that--I think that's what's
> confusing about it.
> 
> > > 3.2 Routing Information Bases
> > >
> > > b) Loc-RIB....
> > >
> > > I think it might be useful to state the contents of the Loc-RIB are
> > > actually installed in the local routing table, and thus used for
> > > forwarding packets on this router. I don't see anyplace this connection
> > > is made explicit, it seems more like it's implicit throughout the doc.
> >
> > Please propose the text.
> 
> Okay:
> 
> The Loc-RIB contains routes which are installed in the local routing table
> and used for forwarding packets received, based on the destination address,
> by the router.

In the absence of any objections within a week I'll put this in the text.

> > > Network Layer Reachability Information
> > >
> > > "An UPDATE message can list multiple routes to be withdrawn...."
> > >
> > > Actually, we don't withdraw routes, we withdraw prefixes, right? The next
> > > paragraph shows this confusion, by talking about routes without
> > > attributes, but routes are prefixes combined with attributes, so.... They
> > > aren't routes, they're prefixes. You remove routes based on withdrawn
> > > prefixes, I think.
> >
> > We withdraw routes. The way BGP withdraws routes is by advertising
> > the NLRI field of these routes in the Withdrawn Routes field of
> > the UPDATE message. And that is precisely what the text said:
> >
> >    An UPDATE message can list multiple routes to be withdrawn from service.
> >    Each such route is identified by its destination (expressed as an IP
> >    prefix), which unambiguously identifies the route in the context of the
> >    BGP speaker - BGP speaker connection to which it has been previously
> >    advertised.
> 
> Hmmm... So, if you receive an update with no attributes, just prefixes in
> the withdrawn section, you won't consider that a withdraw, and remove the
> routes you have from the sending peer from the local tables?
> 
> A route without the attributes is a prefix. :-)
> 
> It depends on whether you are thinking of it in terms of what you're
> sending, or what you're causing on the receiver.
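
One way to picture the "route vs. prefix" point: within the table kept
for a single peer connection, the prefix alone is enough to identify
the route being withdrawn. The sketch below is a hypothetical
per-peer table (the class and method names are invented for the
example), not a description of any implementation.

```python
class AdjRibIn:
    """Hypothetical per-peer table: one instance per BGP connection."""

    def __init__(self):
        self.routes = {}  # prefix -> path attributes

    def announce(self, prefix, attributes):
        # A route is a prefix plus attributes; a new announcement for
        # the same prefix replaces the earlier route from this peer.
        self.routes[prefix] = attributes

    def withdraw(self, prefix):
        # The withdrawal carries only the prefix, which identifies the
        # route within this peer's table. Returns the removed
        # attributes, or None if nothing was installed.
        return self.routes.pop(prefix, None)


rib = AdjRibIn()
rib.announce("192.0.2.0/24", {"as_path": [65001, 65002]})
removed = rib.withdraw("192.0.2.0/24")
```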
> 
> > > 5.1.1 ORIGIN
> > >
> > > "Its value SHOULD NOT be changed by any other speaker."
> > >
> > > I really think this should be "MUST NOT." I can't think of any reason it
> > > wouldn't be, except in the case of aggregation, and that case could be
> > > mentioned here as the only known exception (?).
> >
> > Please propose the text.
> 
> Its value MUST NOT be changed by any other speaker.


In the absence of any objections within a week I'll put this in the text.

Yakov.


Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id OAA09194 for <idr-archive@nic.merit.edu>; Tue, 20 May 2003 14:09:23 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id CAC6091267; Tue, 20 May 2003 14:08:56 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id 9C60491268; Tue, 20 May 2003 14:08:56 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id EC54791267 for <idr@trapdoor.merit.edu>; Tue, 20 May 2003 14:08:54 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id CFDDC5DEE6; Tue, 20 May 2003 14:08:54 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from rtp-core-1.cisco.com (rtp-core-1.cisco.com [64.102.124.12]) by segue.merit.edu (Postfix) with ESMTP id 99C275DEE0 for <idr@merit.edu>; Tue, 20 May 2003 14:08:54 -0400 (EDT)
Received: from cisco.com (uzura.cisco.com [64.102.17.77]) by rtp-core-1.cisco.com (8.12.9/8.12.6) with ESMTP id h4KI8LkL002486; Tue, 20 May 2003 14:08:21 -0400 (EDT)
Received: from dhcp-64-102-60-183.cisco.com (dhcp-64-102-60-183.cisco.com [64.102.60.183]) by cisco.com (8.8.8/2.6/Cisco List Logging/8.8.8) with ESMTP id OAA19469; Tue, 20 May 2003 14:08:19 -0400 (EDT)
Date: Tue, 20 May 2003 14:08:18 -0400 (EDT)
From: Russ White <ruwhite@cisco.com>
Reply-To: Russ White <riw@cisco.com>
To: Jeffrey Haas <jhaas@nexthop.com>
Cc: Yakov Rekhter <yakov@juniper.net>, idr@merit.edu, rtg-dir@ietf.org
Subject: Re: Comments on BGP Draft 20.....
In-Reply-To: <20030520140512.D16646@nexthop.com>
Message-ID: <Pine.OSX.4.51.0305201406430.8886@dhcp-64-102-60-183.cisco.com>
References: <200305201635.h4KGZ7u19549@merlot.juniper.net> <Pine.OSX.4.51.0305201307370.23356@dhcp-64-102-48-215.cisco.com> <20030520140512.D16646@nexthop.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-idr@merit.edu
Precedence: bulk

> > Yes, but it's not used for inscribing a graph of interconnectivity between
> > the AS' in the internetwork, is it? It can be used for that, I suppose, so
> > the text is fine, but it's not used for that--I think that's what's
> > confusing about it.
>
> Perhaps to elaborate on Russ's point, the AS Path gives us the
> graph for this prefix.  Even with a collection of a bunch of routes,
> we're not guaranteed to have the Internet's AS graph.
>
> The text is a *little* vague in this context, but I can't think
> of better wording.

I agree--I've been trying to come up with a better way of wording it, but
I can't think of one. I'd say it's too much of a nit to worry about.

:-)

Russ


__________________________________
riw@cisco.com CCIE <>< Grace Alone



Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id OAA09123 for <idr-archive@nic.merit.edu>; Tue, 20 May 2003 14:06:30 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id 418D391266; Tue, 20 May 2003 14:06:08 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id 112F991267; Tue, 20 May 2003 14:06:07 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id DEF0591266 for <idr@trapdoor.merit.edu>; Tue, 20 May 2003 14:06:06 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id C0DD55DEE6; Tue, 20 May 2003 14:06:06 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from presque.nexthop.com (dns.nexthop.com [65.247.36.216]) by segue.merit.edu (Postfix) with ESMTP id 5C3B35DF09 for <idr@merit.edu>; Tue, 20 May 2003 14:06:06 -0400 (EDT)
Received: (from root@localhost) by presque.nexthop.com (8.12.8/8.11.1) id h4KI5NB6025817; Tue, 20 May 2003 14:05:23 -0400 (EDT) (envelope-from jhaas@jhaas.nexthop.com)
Received: from jhaas.nexthop.com (jhaas.nexthop.com [65.247.36.31]) by presque.nexthop.com (8.12.8/8.12.8) with ESMTP id h4KI5HWB025810; Tue, 20 May 2003 14:05:19 -0400 (EDT) (envelope-from jhaas@jhaas.nexthop.com)
Received: (from jhaas@localhost) by jhaas.nexthop.com (8.11.3nb1/8.11.3) id h4KI5Cl17752; Tue, 20 May 2003 14:05:12 -0400 (EDT)
Date: Tue, 20 May 2003 14:05:12 -0400
From: Jeffrey Haas <jhaas@nexthop.com>
To: Russ White <riw@cisco.com>
Cc: Yakov Rekhter <yakov@juniper.net>, idr@merit.edu, rtg-dir@ietf.org
Subject: Re: Comments on BGP Draft 20.....
Message-ID: <20030520140512.D16646@nexthop.com>
References: <200305201635.h4KGZ7u19549@merlot.juniper.net> <Pine.OSX.4.51.0305201307370.23356@dhcp-64-102-48-215.cisco.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <Pine.OSX.4.51.0305201307370.23356@dhcp-64-102-48-215.cisco.com>; from ruwhite@cisco.com on Tue, May 20, 2003 at 01:44:16PM -0400
X-Virus-Scanned: by AMaViS perl-11
Sender: owner-idr@merit.edu
Precedence: bulk

On Tue, May 20, 2003 at 01:44:16PM -0400, Russ White wrote:
> Yes, but it's not used for inscribing a graph of interconnectivity between
> the AS' in the internetwork, is it? It can be used for that, I suppose, so
> the text is fine, but it's not used for that--I think that's what's
> confusing about it.

Perhaps to elaborate on Russ's point, the AS Path gives us the 
graph for this prefix.  Even with a collection of a bunch of routes,
we're not guaranteed to have the Internet's AS graph.

The text is a *little* vague in this context, but I can't think
of better wording.
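
To make the point above concrete, here is a minimal sketch of what a
speaker can actually learn from the AS paths it holds. The AS numbers and
the helper name as_edges are made up for illustration; nothing here comes
from the draft. The graph contains only adjacencies that appear in some
received path, so a link that no received path crosses stays invisible.

```python
from collections import defaultdict


def as_edges(as_paths):
    """Collect the undirected AS adjacencies visible in a set of AS paths."""
    graph = defaultdict(set)
    for path in as_paths:
        # Collapse prepending (the same AS repeated consecutively).
        hops = [asn for i, asn in enumerate(path) if i == 0 or asn != path[i - 1]]
        for left, right in zip(hops, hops[1:]):
            graph[left].add(right)
            graph[right].add(left)
    return dict(graph)


# Hypothetical AS paths held by one speaker for three prefixes.
paths = [
    [65001, 65002, 65003],
    [65001, 65002, 65004],
    [65005, 65005, 65003],
]
graph = as_edges(paths)
```

Even if AS 65003 and AS 65004 peer directly, that edge is absent here
because no path this speaker received crosses it.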


-- 
Jeff Haas 
NextHop Technologies


Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id NAA09034 for <idr-archive@nic.merit.edu>; Tue, 20 May 2003 13:45:15 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id DED1091253; Tue, 20 May 2003 13:44:52 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id B072D91265; Tue, 20 May 2003 13:44:52 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id F232091253 for <idr@trapdoor.merit.edu>; Tue, 20 May 2003 13:44:50 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id DA41F5DED9; Tue, 20 May 2003 13:44:50 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from rtp-core-2.cisco.com (rtp-core-2.cisco.com [64.102.124.13]) by segue.merit.edu (Postfix) with ESMTP id 77D6E5DED8 for <idr@merit.edu>; Tue, 20 May 2003 13:44:50 -0400 (EDT)
Received: from cisco.com (uzura.cisco.com [64.102.17.77]) by rtp-core-2.cisco.com (8.12.9/8.12.6) with ESMTP id h4KHiGJh027888; Tue, 20 May 2003 13:44:17 -0400 (EDT)
Received: from dhcp-64-102-60-183.cisco.com (dhcp-64-102-60-183.cisco.com [64.102.60.183]) by cisco.com (8.8.8/2.6/Cisco List Logging/8.8.8) with ESMTP id NAA17560; Tue, 20 May 2003 13:44:16 -0400 (EDT)
Date: Tue, 20 May 2003 13:44:16 -0400 (EDT)
From: Russ White <ruwhite@cisco.com>
Reply-To: Russ White <riw@cisco.com>
To: Yakov Rekhter <yakov@juniper.net>
Cc: idr@merit.edu, rtg-dir@ietf.org
Subject: Re: Comments on BGP Draft 20..... 
In-Reply-To: <200305201635.h4KGZ7u19549@merlot.juniper.net>
Message-ID: <Pine.OSX.4.51.0305201307370.23356@dhcp-64-102-48-215.cisco.com>
References: <200305201635.h4KGZ7u19549@merlot.juniper.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-idr@merit.edu
Precedence: bulk

> > This information is sufficient to construct a graph of the AS connectivity
> > from which routing loops may be pruned and some policy decisions at the AS
> > level may be enforced.
> >
> > UPDATE Message Format:
> >
> > The information in the UPDATE message can be used to construct a graph
> > describing the relationships of the various Autonomous Systems.
> >
> > In both cases this is true, I suppose, but in neither case does this really
> > describe what the AS Path is used for, right?
>
> Wrong. As the abstract states quite clearly that this information is used
> to prune routing loops and make some policy decisions at the AS level.

Yes, but it's not used for inscribing a graph of interconnectivity between
the AS' in the internetwork, is it? It can be used for that, I suppose, so
the text is fine, but it's not used for that--I think that's what's
confusing about it.

> > 3.2 Routing Information Bases
> >
> > b) Loc-RIB....
> >
> > I think it might be useful to state the contents of the Loc-RIB are
> > actually installed in the local routing table, and thus used for forwarding
> > packets on this router. I don't see anyplace this connection is made
> > explicit, it seems more like it's implicit throughout the doc.
>
> Please propose the text.

Okay:

The Loc-RIB contains routes which are installed in the local routing table
and used for forwarding packets received, based on the destination address,
by the router.
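
As a rough illustration of the proposed sentence, the sketch below treats
the Loc-RIB as the table consulted when forwarding on destination address.
It assumes longest-prefix match and a plain prefix-to-next-hop mapping;
both are assumptions made for the example, as are the addresses and the
function name lookup.

```python
import ipaddress


def lookup(loc_rib, destination):
    """Return the next hop of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in loc_rib if addr in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return loc_rib[best]


# Hypothetical Loc-RIB contents: prefix -> next hop.
loc_rib = {
    ipaddress.ip_network("10.0.0.0/8"): "192.0.2.1",
    ipaddress.ip_network("10.1.0.0/16"): "192.0.2.2",
}
```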

> > Network Layer Reachability Information
> >
> > "An UPDATE message can list multiple routes to be withdrawn...."
> >
> > Actually, we don't withdraw routes, we withdraw prefixes, right? The next
> > paragraph shows this confusion, by talking about routes without attributes,
> > but routes are prefixes combined with attributes, so.... They aren't
> > routes, they're prefixes. You remove routes based on withdrawn prefixes, I
> > think.
>
> We withdraw routes. The way BGP withdraws routes is by advertising
> the NLRI field of these routes in the Withdrawn Routes field of
> the UPDATE message. And that is precisely what the text said:
>
>    An UPDATE message can list multiple routes to be withdrawn from service.
>    Each such route is identified by its destination (expressed as an IP
>    prefix), which unambiguously identifies the route in the context of the
>    BGP speaker - BGP speaker connection to which it has been previously
>    advertised.

Hmmm... So, if you receive an update with no attributes, just prefixes in
the withdrawn section, you won't consider that a withdraw, and remove the
routes you have from the sending peer from the local tables?

A route without the attributes is a prefix. :-)

It depends on whether you are thinking of it in terms of what you're
sending, or what you're causing on the receiver.

> > 5.1.1 ORIGIN
> >
> > "Its value SHOULD NOT be changed by any other speaker."
> >
> > I really think this should be "MUST NOT." I can't think of any reason it
> > wouldn't be, except in the case of aggregation, and that case could be
> > mentioned here as the only known exception (?).
>
> Please propose the text.

Its value MUST NOT be changed by any other speaker.

:-)

Russ


__________________________________
riw@cisco.com CCIE <>< Grace Alone



Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id MAA08614 for <idr-archive@nic.merit.edu>; Tue, 20 May 2003 12:35:49 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id DBB3D91261; Tue, 20 May 2003 12:35:26 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id A119491262; Tue, 20 May 2003 12:35:26 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id 238C691261 for <idr@trapdoor.merit.edu>; Tue, 20 May 2003 12:35:25 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id 07FD85DE63; Tue, 20 May 2003 12:35:25 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from merlot.juniper.net (natint.juniper.net [207.17.136.129]) by segue.merit.edu (Postfix) with ESMTP id 7DDDF5DE62 for <idr@merit.edu>; Tue, 20 May 2003 12:35:24 -0400 (EDT)
Received: from juniper.net (garnet.juniper.net [172.17.28.17]) by merlot.juniper.net (8.11.3/8.11.3) with ESMTP id h4KGZ7u19549; Tue, 20 May 2003 09:35:07 -0700 (PDT) (envelope-from yakov@juniper.net)
Message-Id: <200305201635.h4KGZ7u19549@merlot.juniper.net>
To: Russ White <riw@cisco.com>
Cc: idr@merit.edu, rtg-dir@ietf.org
Subject: Re: Comments on BGP Draft 20..... 
In-Reply-To: Your message of "Fri, 09 May 2003 10:13:14 EDT." <Pine.WNT.4.53.0305090945390.2372@russpc> 
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-ID: <26545.1053448507.1@juniper.net>
Date: Tue, 20 May 2003 09:35:07 -0700
From: Yakov Rekhter <yakov@juniper.net>
Sender: owner-idr@merit.edu
Precedence: bulk

Russ,

> 
> Some of these are going to echo Alex's comments, but that's okay, I think.
> Mostly just nits....
> 
> :-)

Thanks for the comments. My response is in-line...

> 
> Russ
> 
> __________________________________
> riw@cisco.com CCIE <>< Grace Alone
> 
> -----
> 
> Abstract:
> 
> This information is sufficient to construct a graph of the AS connectivity
> from which routing loops may be pruned and some policy decisions at the AS
> level may be enforced.
> 
> UPDATE Message Format:
> 
> The information in the UPDATE message can be used to construct a graph
> describing the relationships of the various Autonomous Systems.
> 
> In both cases this is true, I suppose, but in neither case does this really
> describe what the AS Path is used for, right? 

Wrong. As the abstract states quite clearly that this information is used
to prune routing loops and make some policy decisions at the AS level.
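
For readers who want the loop-pruning use spelled out, a minimal sketch
follows. It assumes the usual check of scanning the received AS path for
the local AS number; the AS numbers, prefixes, and the name has_as_loop
are hypothetical.

```python
def has_as_loop(local_as, as_path):
    """True if the local AS already appears in the received AS path."""
    return local_as in as_path


LOCAL_AS = 65010  # hypothetical local AS number

# Hypothetical routes received from a peer: prefix -> AS path.
received = {
    "192.0.2.0/24": [65001, 65002],
    "198.51.100.0/24": [65001, 65010, 65003],
}

# Routes whose path already contains the local AS are dropped.
accepted = {
    prefix: path
    for prefix, path in received.items()
    if not has_as_loop(LOCAL_AS, path)
}
```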

> I would think we'd want to
> describe it less in terms of a "graph of the connectivity in the
> internetwork," and more in terms of "a graph of the path through Autonomous
> Systems used to reach the destination advertised." It could be confusing,
> since there isn't anyplace where we discuss building a graph of
> interconnectivity between the Autonomous Systems....
> 
> -----
> 
> Forwarding Paradigm:
> 
> This document uses the term "Autonomous System" (AS)  throughout....
> 
> This entire paragraph is a repeat--I'd leave it just in the definitions.

The definitions section is supposed to have a *summary* of the definitions
used in the spec.

> -----
> 
> Forwarding Paradigms:
> 
> The initial data flow....
> 
> This paragraph has two different thoughts in it, one about incremental
> updates, and the other about keeping data that you've received. It seems
> like just putting a return after "as the routing tables change."

The two are related, as the reason for keeping the updates you've received
is that the exchange of information is based on incremental updates.

> -----
> 
> Forwarding Paradigms:
> 
> The paragraph starting "KEEPALIVE messages" should, I think, be moved up
> above the section on route exchange. I don't know why, it just seems less
> like it's jumping all over the place that way.

Disagree.

> -----
> 
> 3.1 Routes: Advertisement and Storage
> 
> It almost seems like the section about The initial data flow should maybe
> be put entirely under this section someplace (?).
> 
> The first paragraph in this section is really a definition of a route vs a
> prefix, and should probably be in the definitions.

see above.

> The paragraph "Changing attribute of a route...." needs a "the," or
> attribute needs an "s."

Ok.

> -----
> 
> 3.2 Routing Information Bases
> 
> b) Loc-RIB....
> 
> I think it might be useful to state the contents of the Loc-RIB are
> actually installed in the local routing table, and thus used for forwarding
> packets on this router. I don't see anyplace this connection is made
> explicit, it seems more like it's implicit throughout the doc.

Please propose the text.

> -----
> 
> Page 18, a) LOCAL_PREF
> 
> "....to inform other peers...." should be "....to inform its other
> peers...."

Sure.

> -----
> 
> Network Layer Reachability Information
> 
> "This variable length field contains a list of IP address prefixes."
> 
> I think we can kill "address" here.

Sure.

> 
> a) Length
> 
> "The Length field indicates...." The sentence can start with
> "Indicates..."

I prefer to keep the current text.

> 
> b) Prefix
> 
> "The Prefix field indicates...." The sentence can start with
> "Indicates...."

Ditto.

> 
> -----
> 
> Network Layer Reachability Information
> 
> "An UPDATE message can list multiple routes to be withdrawn...."
> 
> Actually, we don't withdraw routes, we withdraw prefixes, right? The next
> paragraph shows this confusion, by talking about routes without attributes,
> but routes are prefixes combined with attributes, so.... They aren't
> routes, they're prefixes. You remove routes based on withdrawn prefixes, I
> think.

We withdraw routes. The way BGP withdraws routes is by advertising
the NLRI field of these routes in the Withdrawn Routes field of
the UPDATE message. And that is precisely what the text said:

   An UPDATE message can list multiple routes to be withdrawn from service.
   Each such route is identified by its destination (expressed as an IP
   prefix), which unambiguously identifies the route in the context of the
   BGP speaker - BGP speaker connection to which it has been previously
   advertised.
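
A minimal sketch of the behaviour the quoted text describes, assuming the
received routes are kept in a table keyed by (peer, prefix). The table
layout, peer names, and the function apply_update are illustrative only.
A withdrawal carries just the prefix, and the prefix together with the
session it arrived on is enough to pick out the route to remove.

```python
def apply_update(adj_rib_in, peer, withdrawn=(), announced=None):
    """Apply one UPDATE from `peer` to a table keyed by (peer, prefix)."""
    for prefix in withdrawn:
        # The prefix alone identifies the route on this peer's session.
        adj_rib_in.pop((peer, prefix), None)
    for prefix, attrs in (announced or {}).items():
        adj_rib_in[(peer, prefix)] = attrs


rib = {}
apply_update(rib, "peer-a", announced={"192.0.2.0/24": {"as_path": [65001]}})
apply_update(rib, "peer-b", announced={"192.0.2.0/24": {"as_path": [65002]}})
# peer-a withdraws the prefix; peer-b's route to it is untouched.
apply_update(rib, "peer-a", withdrawn=["192.0.2.0/24"])
```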


> 
> ------
> 
> 5. Path Attributes
> 
> "Well-known attributes MUST be recognized by all BGP implementations."
> 
> This sentence, as strange as it may sound, implies it's the attribute's
> fault if the BGP implementation doesn't recognize it, that it's up to the
> attribute definers to, in some way, make certain that BGP implementations
> will recognize it. I think it should probably be worded the other way
> 'round:
> 
> "BGP implementations MUST recognize all well-known attributes."

Sure.

> -----
> 
> 5. Path Attributes
> 
> "All well-known attributes MUST be passed along (after proper updating, if
> necessary) to other BGP peers."
> 
> This just seems a little rough. Maybe this:
> 
> "Once a BGP peer has updated any well-known attributes, it MUST pass these
> attributes in any updates it transmits to its peers."

Sure.

> 
> -----
> 
> 5.1.1 ORIGIN
> 
> "Its value SHOULD NOT be changed by any other speaker."
> 
> I really think this should be "MUST NOT." I can't think of any reason it
> wouldn't be, except in the case of aggregation, and that case could be
> mentioned here as the only known exception (?).

Please propose the text.

Yakov.


Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id AAA19599 for <idr-archive@nic.merit.edu>; Mon, 19 May 2003 00:05:08 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id AC52F91208; Mon, 19 May 2003 00:04:49 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id 6DD379124E; Mon, 19 May 2003 00:04:49 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id 4FDDB91208 for <idr@trapdoor.merit.edu>; Mon, 19 May 2003 00:03:16 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id 0A6665DE35; Mon, 19 May 2003 00:03:16 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from mailout3.samsung.com (u33.gpu114.samsung.co.kr [203.254.224.33]) by segue.merit.edu (Postfix) with ESMTP id 5012F5DE27 for <idr@merit.edu>; Mon, 19 May 2003 00:03:15 -0400 (EDT)
Received: from custom-daemon.mailout3.samsung.com by mailout3.samsung.com (iPlanet Messaging Server 5.2 HotFix 1.05 (built Nov  6 2002)) id <0HF4007018LCDZ@mailout3.samsung.com> for idr@merit.edu; Mon, 19 May 2003 13:03:12 +0900 (KST)
Received: from ep_mmp1 (localhost [127.0.0.1]) by mailout3.samsung.com (iPlanet Messaging Server 5.2 HotFix 1.05 (built Nov 6 2002)) with ESMTP id <0HF4004HD8LBTT@mailout3.samsung.com> for idr@merit.edu; Mon, 19 May 2003 13:03:12 +0900 (KST)
Received: from Manav ([107.108.3.180]) by mmp1.samsung.com (iPlanet Messaging Server 5.2 HotFix 1.05 (built Nov  6 2002)) with ESMTPA id <0HF400CE58LAZE@mmp1.samsung.com> for idr@merit.edu; Mon, 19 May 2003 13:03:11 +0900 (KST)
Date: Mon, 19 May 2003 09:29:21 +0530
From: Manav Bhatia <manav@samsung.com>
Subject: Re: FW: I-D ACTION:draft-bhatia-ecmp-routes-in-bgp-00.txt
To: Mareline Sheldon <marelines@yahoo.com>
Cc: idr@merit.edu
Reply-To: Manav Bhatia <manav@samsung.com>
Message-id: <00b701c31dbb$082729a0$b4036c6b@sisodomain.com>
MIME-version: 1.0
X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2800.1165
X-Mailer: Microsoft Outlook Express 6.00.2800.1158
Content-type: text/plain; charset=iso-8859-1
Content-transfer-encoding: 7BIT
X-Priority: 3
X-MSMail-priority: Normal
References: <20030518042424.71655.qmail@web20310.mail.yahoo.com>
Sender: owner-idr@merit.edu
Precedence: bulk

Mareline,
The design implicitly takes care of the scenario you explain.  Though I
confess that I have not been really clear on this in this version of the
draft.

To advertise two ECMP routes with different attributes we will use two
UPDATEs, each sent with a blank ECMP_NEXT_HOP attribute (the number of
next hops will be kept at zero). On receiving such UPDATEs, the receiver
will know that, since the ECMP_NEXT_HOP attribute is present in the
UPDATE, the route needs to be added to what it already has.

I guess the following text needs to be added in the draft.

The receiver SHOULD NOT remove any previous route; it SHOULD add the route
received with an ECMP_NEXT_HOP attribute rather than replace the previous
routes.

When advertising more than one ECMP hop with identical attributes the
sender SHOULD send a single update with multiple hops listed in the
ECMP_NEXT_HOP attribute.

When advertising more than one ECMP hop which do not have identical
attributes multiple BGP updates MUST be sent with the ECMP_NEXT_HOP
attribute included to suppress route replacement.
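
The three rules above could be sketched on the receiving side roughly as
follows. This is only one reading of the proposed text, not the draft's
encoding: the function name receive_update, the route-table layout, and
the way the attribute is passed (None when absent, a possibly empty list
when present) are all assumptions made for illustration.

```python
def receive_update(routes, prefix, attrs, next_hop, ecmp_next_hops=None):
    """Install a route for `prefix`.

    ecmp_next_hops is None when the ECMP_NEXT_HOP attribute is absent and
    a list (possibly empty) of additional next hops when it is present.
    """
    if ecmp_next_hops is None:
        # Ordinary behaviour: the new route replaces the old one.
        routes[prefix] = [(next_hop, attrs)]
        return
    # Attribute present: add to what is already held, do not replace.
    entries = routes.setdefault(prefix, [])
    for hop in [next_hop, *ecmp_next_hops]:
        entries.append((hop, attrs))


routes = {}
# Two UPDATEs with different AS paths, each carrying a blank ECMP_NEXT_HOP.
receive_update(routes, "203.0.113.0/24", {"as_path": (112, 123)},
               "192.0.2.1", ecmp_next_hops=[])
receive_update(routes, "203.0.113.0/24", {"as_path": (564, 232)},
               "192.0.2.2", ecmp_next_hops=[])
```

After both UPDATEs the prefix holds two entries, one per AS path. An
UPDATE without the attribute would fall back to plain replacement.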

But a more important question is whether we need this kind of mechanism
in BGP at all. Multiple drafts have already been proposed that strive to
achieve similar goals using different techniques and mechanisms. I guess
one motivation is interoperability between different vendors when
advertising multiple BGP paths of the same preference.

Once we're through with the above discussion, we can sit down and look into
the nitty-gritties of each proposal.

Regards,
Manav



----- Original Message ----- 
From: "Mareline Sheldon" <marelines@yahoo.com>
To: "Manav Bhatia" <manav@samsung.com>
Cc: <idr@merit.edu>
Sent: Sunday, May 18, 2003 9:54 AM
Subject: Re: FW: I-D ACTION:draft-bhatia-ecmp-routes-in-bgp-00.txt


> Manav,
> Using this draft, can we advertise two ECMP routes of equal preference but with, say,
> different AS paths? As far as I could gather, you use one additional attribute to describe
> equal-cost routes that share the same path attributes. This way you can definitely advertise
> routes with all the same attributes. But what happens when one of my path attributes is
> different, e.g. AS PATH 112 123 on one route and AS PATH 564 232 on the other?
>
> Can this be done here?
>
> Regards,
> Mareline S.
>
> --- Manav Bhatia <manav@samsung.com> wrote:
> > Hi,
> > Please look into this new Internet draft which is available from the
> > on-line Internet-Drafts directories.
> >
> > I-D ACTION:draft-bhatia-ecmp-routes-in-bgp-00.txt
> > To: IETF-Announce: ;
> > Subject: I-D ACTION:draft-bhatia-ecmp-routes-in-bgp-00.txt
> > From: Internet-Drafts@ietf.org
> > Date: Thu, 15 May 2003 07:19:32 -0400
> > Reply-to: Internet-Drafts@ietf.org
> > Sender: owner-ietf-announce@ietf.org
> >
> > A New Internet-Draft is available
> >
> > Title  : Advertising Equal Cost Multi-Path (ECMP) routes in BGP
> > Author(s) : M. Bhatia
> > Filename : draft-bhatia-ecmp-routes-in-bgp-00.txt
> > Pages  : 7
> > Date  : 2003-5-14
> >
> > This document describes an extensible mechanism that will allow a BGP
> > [BGP4] speaker to advertise equal cost multi-path (ECMP) routes for a
> > destination to its peers without changing the semantics of the UPDATE
> > message.
> >
> > A new BGP attribute is introduced that will be used to advertise the
> > multiple next hops for the feasible and the un-feasible ECMP BGP routes to
> > the remote peers.
> >
> > The mechanisms described in this document are applicable to all routers,
> > both those with the ability to inject multiple routing  entries in their
> > forwarding table and those without (although the latter need not implement
> > some extensions described in this document).
> >
> > A URL for this Internet-Draft is:
> > http://www.ietf.org/internet-drafts/draft-bhatia-ecmp-routes-in-bgp-00.txt
> >
> >
> > To remove yourself from the IETF Announcement list, send a message to
> > ietf-announce-request with the word unsubscribe in the body of the message.
> >
> > Internet-Drafts are also available by anonymous FTP. Login with the
> > username "anonymous" and a password of your e-mail address. After logging
> > in, type "cd internet-drafts" and then
> >  "get draft-bhatia-ecmp-routes-in-bgp-00.txt".
> >
> >
> >
>
>



Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id AAA10768 for <idr-archive@nic.merit.edu>; Sun, 18 May 2003 00:24:46 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id 0132F91246; Sun, 18 May 2003 00:24:27 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id BABDA91248; Sun, 18 May 2003 00:24:26 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id 9886791246 for <idr@trapdoor.merit.edu>; Sun, 18 May 2003 00:24:25 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id 713DF5E40A; Sun, 18 May 2003 00:24:25 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from web20310.mail.yahoo.com (web20310.mail.yahoo.com [216.136.226.91]) by segue.merit.edu (Postfix) with SMTP id 065695E406 for <idr@merit.edu>; Sun, 18 May 2003 00:24:25 -0400 (EDT)
Message-ID: <20030518042424.71655.qmail@web20310.mail.yahoo.com>
Received: from [219.65.142.150] by web20310.mail.yahoo.com via HTTP; Sat, 17 May 2003 21:24:24 PDT
Date: Sat, 17 May 2003 21:24:24 -0700 (PDT)
From: Mareline Sheldon <marelines@yahoo.com>
Subject: Re: FW: I-D ACTION:draft-bhatia-ecmp-routes-in-bgp-00.txt
To: Manav Bhatia <manav@samsung.com>
Cc: idr@merit.edu
In-Reply-To: <068e01c31ae3$20cfe620$b4036c6b@sisodomain.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Sender: owner-idr@merit.edu
Precedence: bulk

Manav,
Using this draft, can we advertise two ECMP routes of equal preference but with, say,
different AS paths? As far as I could gather, you use one additional attribute to describe
equal-cost routes that share the same path attributes. This way you can definitely advertise
routes with all the same attributes. But what happens when one of my path attributes is
different, e.g. AS PATH 112 123 on one route and AS PATH 564 232 on the other?

Can this be done here?

Regards,
Mareline S.

--- Manav Bhatia <manav@samsung.com> wrote:
> Hi,
> Please look into this new Internet draft which is available from the
> on-line Internet-Drafts directories.
> 
> I-D ACTION:draft-bhatia-ecmp-routes-in-bgp-00.txt
> To: IETF-Announce: ;
> Subject: I-D ACTION:draft-bhatia-ecmp-routes-in-bgp-00.txt
> From: Internet-Drafts@ietf.org
> Date: Thu, 15 May 2003 07:19:32 -0400
> Reply-to: Internet-Drafts@ietf.org
> Sender: owner-ietf-announce@ietf.org
> 
> A New Internet-Draft is available
> 
> Title  : Advertising Equal Cost Multi-Path (ECMP) routes in BGP
> Author(s) : M. Bhatia
> Filename : draft-bhatia-ecmp-routes-in-bgp-00.txt
> Pages  : 7
> Date  : 2003-5-14
> 
> This document describes an extensible mechanism that will allow a BGP
> [BGP4] speaker to advertise equal cost multi-path (ECMP) routes for a
> destination to its peers without changing the semantics of the UPDATE
> message.
> 
> A new BGP attribute is introduced that will be used to advertise the
> multiple next hops for the feasible and the un-feasible ECMP BGP routes to
> the remote peers.
> 
> The mechanisms described in this document are applicable to all routers,
> both those with the ability to inject multiple routing  entries in their
> forwarding table and those without (although the latter need not implement
> some extensions described in this document).
> 
> A URL for this Internet-Draft is:
> http://www.ietf.org/internet-drafts/draft-bhatia-ecmp-routes-in-bgp-00.txt
> 
> 
> To remove yourself from the IETF Announcement list, send a message to
> ietf-announce-request with the word unsubscribe in the body of the message.
> 
> Internet-Drafts are also available by anonymous FTP. Login with the
> username "anonymous" and a password of your e-mail address. After logging
> in, type "cd internet-drafts" and then
>  "get draft-bhatia-ecmp-routes-in-bgp-00.txt".
> 
> 
> 




Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id MAA28927 for <idr-archive@nic.merit.edu>; Thu, 15 May 2003 12:54:06 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id 1AC7B9130E; Thu, 15 May 2003 12:53:00 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id F226B9134F; Thu, 15 May 2003 12:52:53 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id C8CC79135D for <idr@trapdoor.merit.edu>; Thu, 15 May 2003 12:52:29 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id 7F3E85DF5C; Thu, 15 May 2003 12:52:29 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from workhorse.fictitious.org (workhorse.fictitious.org [209.150.1.230]) by segue.merit.edu (Postfix) with ESMTP id AC0F05DF60 for <idr@merit.edu>; Thu, 15 May 2003 12:52:28 -0400 (EDT)
Received: from workhorse.fictitious.org (localhost.fictitious.org [127.0.0.1]) by workhorse.fictitious.org (8.9.3/8.9.3) with ESMTP id MAA85953; Thu, 15 May 2003 12:50:41 -0400 (EDT) (envelope-from curtis@workhorse.fictitious.org)
Message-Id: <200305151650.MAA85953@workhorse.fictitious.org>
To: Mireille Shammas <mireille.shammas@Alcatel.com>
Cc: Manav Bhatia <manav@samsung.com>, idr@merit.edu
Reply-To: curtis@fictitious.org
Subject: Re: FW: I-D ACTION:draft-bhatia-ecmp-routes-in-bgp-00.txt 
In-reply-to: Your message of "Thu, 15 May 2003 10:38:44 EDT." <3EC3A674.7142D847@alcatel.com> 
Date: Thu, 15 May 2003 12:50:40 -0400
From: Curtis Villamizar <curtis@fictitious.org>
Sender: owner-idr@merit.edu
Precedence: bulk

In message <3EC3A674.7142D847@alcatel.com>, Mireille Shammas writes:
> Hi Manav,
> In this draft you don't mention anything about ECMP in a BGP/MPLS VPN network
> and more precisely ECMP between PE and CE where EBGP is used. Note that IBGP is
> used to relay the MPLS label information between PEs. I think this will totally
> depend on how you choose the labels on a local PE to advertise to a remote PE
> to carry the VPN traffic back to the CE. Sample scenario below:
> 
> |      |-------ebgp session--------|
> |                                                        |       |
> | CE|-------ebgp session--------| PE1| =======IBGP/MPLS===========|PE2| .......
> 
> |      |-------ebgp session--------|
> |                                                        |       |
> 
>  I am mainly interested in what will happen on PE1, and how to balance
> VPN/labelled traffic coming from PE1 towards CE .
> Thanks
> Mireille


The MPLS LSP ends at PE1.  PE2 doesn't care what happens at the other
end of the LSP.  PE1 is free to do the load split exactly as it does
now without telling PE2 that there is more than one next hop.

The more interesting (difficult) cases are where PE1 has EBGP peers
that advertise routes with different but equal preference attributes
and it doesn't matter at all whether MPLS is in use.

The really difficult (and common) case is where equal cost routes are
advertised by two border routers into the IBGP mesh and a router
internal to the mesh splits among the two routers.  This router does
not advertise anything to BGP about this route and therefore cannot
advertise it as a multipath.
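
For concreteness, one common way a router might do such a local split is
to hash each flow onto one of the equal-cost next hops, so that no BGP
advertisement is involved at all. The sketch below assumes a hash over
source and destination address; the hash choice, the addresses, and the
name pick_next_hop are assumptions and not something this thread
specifies.

```python
import zlib


def pick_next_hop(next_hops, src, dst):
    """Pick one equal-cost next hop for a flow, stable per (src, dst)."""
    key = f"{src}>{dst}".encode()
    return next_hops[zlib.crc32(key) % len(next_hops)]


# Hypothetical equal-cost next hops toward the two border routers.
border_routers = ["192.0.2.1", "192.0.2.2"]
chosen = pick_next_hop(border_routers, "10.0.0.5", "203.0.113.7")
```

Packets of the same flow always hash to the same next hop, which keeps
them in order while different flows spread across both routers.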

Curtis


Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id KAA25054 for <idr-archive@nic.merit.edu>; Thu, 15 May 2003 10:39:11 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id D0A54912BF; Thu, 15 May 2003 10:38:48 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id 9C050912C0; Thu, 15 May 2003 10:38:48 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id 6399A912BF for <idr@trapdoor.merit.edu>; Thu, 15 May 2003 10:38:47 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id 4EB7E5DF04; Thu, 15 May 2003 10:38:47 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from kanmx2.ca.alcatel.com (kanfw1.ottawa.alcatel.ca [192.75.23.69]) by segue.merit.edu (Postfix) with SMTP id B43155DEE1 for <idr@merit.edu>; Thu, 15 May 2003 10:38:46 -0400 (EDT)
Received: (qmail 22762 invoked from network); 15 May 2003 14:41:14 -0000
Received: from unknown (HELO alcatel.com) (138.120.105.202) by kanmx2.ca.alcatel.com with SMTP; 15 May 2003 14:41:14 -0000
Message-ID: <3EC3A674.7142D847@alcatel.com>
Date: Thu, 15 May 2003 10:38:44 -0400
From: Mireille Shammas <mireille.shammas@alcatel.com>
Organization: Alcatel CID
X-Mailer: Mozilla 4.79 [en] (Windows NT 5.0; U)
X-Accept-Language: en
MIME-Version: 1.0
To: Manav Bhatia <manav@samsung.com>
Cc: idr@merit.edu
Subject: Re: FW: I-D ACTION:draft-bhatia-ecmp-routes-in-bgp-00.txt
References: <068e01c31ae3$20cfe620$b4036c6b@sisodomain.com>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-idr@merit.edu
Precedence: bulk

Hi Manav,
In this draft you don't mention anything about ECMP in a BGP/MPLS VPN network
and more precisely ECMP between PE and CE where EBGP is used. Note that IBGP is
used to relay the MPLS label information between PEs. I think this will totally
depend on how you choose the labels on a local PE to advertise to a remote PE
to carry the VPN traffic back to the CE. Sample scenario below:

|      |-------ebgp session--------|
|                                                        |       |
| CE|-------ebgp session--------| PE1| =======IBGP/MPLS===========|PE2| .......

|      |-------ebgp session--------|
|                                                        |       |

I am mainly interested in what will happen on PE1, and how to balance
VPN/labelled traffic coming from PE1 towards CE.
Thanks
Mireille


Manav Bhatia wrote:

> Hi,
> Please look into this new Internet draft which is available from the
> on-line Internet-Drafts directories.
>
> I-D ACTION:draft-bhatia-ecmp-routes-in-bgp-00.txt
> To: IETF-Announce: ;
> Subject: I-D ACTION:draft-bhatia-ecmp-routes-in-bgp-00.txt
> From: Internet-Drafts@ietf.org
> Date: Thu, 15 May 2003 07:19:32 -0400
> Reply-to: Internet-Drafts@ietf.org
> Sender: owner-ietf-announce@ietf.org
>
> A New Internet-Draft is available
>
> Title  : Advertising Equal Cost Multi-Path (ECMP) routes in BGP
> Author(s) : M. Bhatia
> Filename : draft-bhatia-ecmp-routes-in-bgp-00.txt
> Pages  : 7
> Date  : 2003-5-14
>
> This document describes an extensible mechanism that will allow a BGP
> [BGP4] speaker to advertise equal cost multi-path (ECMP) routes for a
> destination to its peers without changing the semantics of the UPDATE
> message.
>
> A new BGP attribute is introduced that will be used to advertise the
> multiple next hops for the feasible and the un-feasible ECMP BGP routes to
> the remote peers.
>
> The mechanisms described in this document are applicable to all routers,
> both those with the ability to inject multiple routing  entries in their
> forwarding table and those without (although the latter need not implement
> some extensions described in this document).
>
> A URL for this Internet-Draft is:
> http://www.ietf.org/internet-drafts/draft-bhatia-ecmp-routes-in-bgp-00.txt
>
> To remove yourself from the IETF Announcement list, send a message to
> ietf-announce-request with the word unsubscribe in the body of the message.
>
> Internet-Drafts are also available by anonymous FTP. Login with the
> username "anonymous" and a password of your e-mail address. After logging
> in, type "cd internet-drafts" and then
>  "get draft-bhatia-ecmp-routes-in-bgp-00.txt".



Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id JAA21781 for <idr-archive@nic.merit.edu>; Thu, 15 May 2003 09:13:05 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id A7480912BA; Thu, 15 May 2003 09:12:36 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id 74A00912BB; Thu, 15 May 2003 09:12:36 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id 27F3A912BA for <idr@trapdoor.merit.edu>; Thu, 15 May 2003 09:12:35 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id 150305DECD; Thu, 15 May 2003 09:12:35 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from mailout1.samsung.com (u24.gpu114.samsung.co.kr [203.254.224.24]) by segue.merit.edu (Postfix) with ESMTP id BD3755DECC for <idr@merit.edu>; Thu, 15 May 2003 09:12:34 -0400 (EDT)
Received: from custom-daemon.mailout1.samsung.com by mailout1.samsung.com (iPlanet Messaging Server 5.2 HotFix 1.05 (built Nov  6 2002)) id <0HEX00801JCWG2@mailout1.samsung.com> for idr@merit.edu; Thu, 15 May 2003 22:12:32 +0900 (KST)
Received: from ep_mmp2 (localhost [127.0.0.1]) by mailout1.samsung.com (iPlanet Messaging Server 5.2 HotFix 1.05 (built Nov 6 2002)) with ESMTP id <0HEX00J5RJCVYY@mailout1.samsung.com> for idr@merit.edu; Thu, 15 May 2003 22:12:32 +0900 (KST)
Received: from Manav ([107.108.3.180]) by mmp2.samsung.com (iPlanet Messaging Server 5.2 HotFix 1.05 (built Nov  6 2002)) with ESMTPA id <0HEX001NUJCUKJ@mmp2.samsung.com> for idr@merit.edu; Thu, 15 May 2003 22:12:31 +0900 (KST)
Date: Thu, 15 May 2003 18:38:48 +0530
From: Manav Bhatia <manav@samsung.com>
Subject: FW: I-D ACTION:draft-bhatia-ecmp-routes-in-bgp-00.txt
To: idr@merit.edu
Reply-To: Manav Bhatia <manav@samsung.com>
Message-id: <068e01c31ae3$20cfe620$b4036c6b@sisodomain.com>
MIME-version: 1.0
X-MIMEOLE: Produced By Microsoft MimeOLE V6.00.2800.1165
X-Mailer: Microsoft Outlook Express 6.00.2800.1158
Content-type: text/plain; charset=iso-8859-1
Content-transfer-encoding: 7BIT
X-Priority: 3
X-MSMail-priority: Normal
Sender: owner-idr@merit.edu
Precedence: bulk

Hi,
Please look into this new Internet draft which is available from the
on-line Internet-Drafts directories.

I-D ACTION:draft-bhatia-ecmp-routes-in-bgp-00.txt
To: IETF-Announce: ;
Subject: I-D ACTION:draft-bhatia-ecmp-routes-in-bgp-00.txt
From: Internet-Drafts@ietf.org
Date: Thu, 15 May 2003 07:19:32 -0400
Reply-to: Internet-Drafts@ietf.org
Sender: owner-ietf-announce@ietf.org

A New Internet-Draft is available

Title  : Advertising Equal Cost Multi-Path (ECMP) routes in BGP
Author(s) : M. Bhatia
Filename : draft-bhatia-ecmp-routes-in-bgp-00.txt
Pages  : 7
Date  : 2003-5-14

This document describes an extensible mechanism that will allow a BGP
[BGP4] speaker to advertise equal cost multi-path (ECMP) routes for a
destination to its peers without changing the semantics of the UPDATE
message.

A new BGP attribute is introduced that will be used to advertise the
multiple next hops for the feasible and the un-feasible ECMP BGP routes to
the remote peers.

The mechanisms described in this document are applicable to all routers,
both those with the ability to inject multiple routing  entries in their
forwarding table and those without (although the latter need not implement
some extensions described in this document).

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-bhatia-ecmp-routes-in-bgp-00.txt


To remove yourself from the IETF Announcement list, send a message to
ietf-announce-request with the word unsubscribe in the body of the message.

Internet-Drafts are also available by anonymous FTP. Login with the
username "anonymous" and a password of your e-mail address. After logging
in, type "cd internet-drafts" and then
 "get draft-bhatia-ecmp-routes-in-bgp-00.txt".





Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id SAA26035 for <idr-archive@nic.merit.edu>; Wed, 14 May 2003 18:10:08 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id 9C2C2912B2; Wed, 14 May 2003 18:07:57 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id 40D40912B3; Wed, 14 May 2003 18:07:57 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id 24D03912B2 for <idr@trapdoor.merit.edu>; Wed, 14 May 2003 18:07:49 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id 0AC665E241; Wed, 14 May 2003 18:07:49 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from workhorse.fictitious.org (workhorse.fictitious.org [209.150.1.230]) by segue.merit.edu (Postfix) with ESMTP id BD8E75E23D for <idr@merit.edu>; Wed, 14 May 2003 18:07:47 -0400 (EDT)
Received: from workhorse.fictitious.org (localhost.fictitious.org [127.0.0.1]) by workhorse.fictitious.org (8.9.3/8.9.3) with ESMTP id SAA82275; Wed, 14 May 2003 18:06:09 -0400 (EDT) (envelope-from curtis@workhorse.fictitious.org)
Message-Id: <200305142206.SAA82275@workhorse.fictitious.org>
To: David Meyer <dmm@maoz.com>
Cc: Curtis Villamizar <curtis@fictitious.org>, idr@merit.edu
Reply-To: curtis@fictitious.org
Subject: Re: draft-ietf-idr-bgp-analysis-03.txt 
In-reply-to: Your message of "Wed, 14 May 2003 13:43:55 PDT." <20030514134355.A26408@maoz.com> 
Date: Wed, 14 May 2003 18:06:08 -0400
From: Curtis Villamizar <curtis@fictitious.org>
Sender: owner-idr@merit.edu
Precedence: bulk

In message <20030514134355.A26408@maoz.com>, David Meyer writes:
> 
> 	Curtis,
> 	
> 	These are great comments. One point
> 
> >> ------------
> >> 
> >> The following statement is incorrect:
> >> 
> >>    Finally, since the dynamic properties of the network cannot be
> >>    quantitatively bounded, stability must be addressed via heuristics
> >>    such as BGP Route Flap Damping [RFC2439]. Due to the nature of BGP,
> >>    such damping should be viewed as a matter local to an autonomous
> >>    system matter (see also Appendix F.2 of [BGP4]).
> >> 
> >> The amount of change is inherently bounded in BGP (as I described
> >> above).  BGP Route Flap Damping was initially proposed for two
> >> reasons, 1) to protect a specific commercial implementation that was
> >> not sufficiently robust, 2) to improve convergence of stable routes.
> >> BGP Route Flap Damping is not necessary to bound the amount of change
> >> in BGP routing.
> 
> 	Yes, but route flap dampening is just a heuristic that we
> 	use because the dynamics can't be bounded; that's what I
> 	was after. Do you disagree with this?
> 
> 	Dave


I made a comment on that later in my earlier note but I'll provide
more detail.

There were two initial motivations for BGP Route Flap Damping.  One
was that a certain BGP implementation wasn't robust when we first
started, though it was much more robust by the time the RFC came out.
Initially, if you pushed hard enough it would fall over.  The second
reason was that even after the implementations out there were all quite
robust, convergence for stable routes was a lot slower than we'd like
if there were a lot of unstable routes.  [A third reason was that a
certain router would drop packets when it had a route cache
implementation, but we'd all rather forget about that design error.]
Anyway, these reasons motivated the idea and kept it going.

BGP is inherently work conserving, meaning that above some amount of
incoming change for a given number of prefixes, the amount of work and
the amount of outgoing change are either bounded by the number of
flapping prefixes or constant above some number of prefixes (saturated
CPU).  This amount of work and outgoing (advertised out) churn is
bounded but still quite high.  BGP Route Flap Damping was deployed in
recognition of the fact that a very small percentage of the prefixes,
representing an even smaller part of the reachable address space, were
contributing most of the churn, while the vast majority of prefixes were
quite stable.  That was the practical reason that providers were
willing to turn on BGP Route Flap Damping and deal with the
occasional headache of clearing the history for prefixes that they'd
rather not get damped (or where the problem was known to have been
fixed, like when some other NOC called).
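
A minimal sketch of the work-conserving behavior described above, assuming
a hypothetical PendingUpdates queue (the class and method names are
illustrative and not taken from any BGP implementation).  Because a newer
change for a prefix replaces the older pending one, the backlog is bounded
by the number of distinct prefixes rather than by the number of flaps:

```python
from collections import OrderedDict


class PendingUpdates:
    """Hypothetical outbound queue that keeps only the latest state per prefix."""

    def __init__(self):
        # prefix -> most recent route (None could stand for a withdrawal)
        self._latest = OrderedDict()

    def record(self, prefix, route):
        # Drop any older pending change for this prefix and queue the new
        # one at the back, so a flapping prefix never occupies more than
        # one slot and does not jump ahead of other pending prefixes.
        self._latest.pop(prefix, None)
        self._latest[prefix] = route

    def drain(self, limit):
        """Return up to `limit` (prefix, route) pairs, oldest pending first."""
        out = []
        while self._latest and len(out) < limit:
            out.append(self._latest.popitem(last=False))
        return out

    def __len__(self):
        return len(self._latest)


q = PendingUpdates()
for i in range(1000):                     # one prefix flapping 1000 times
    q.record("192.0.2.0/24", f"path-{i}")
q.record("198.51.100.0/24", "path-a")     # a stable prefix changing once
backlog = len(q)                          # two entries, not 1001
pending = q.drain(limit=10)
```

When the socket unblocks, only the two current routes are sent, not the
history of the flapping prefix.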

[ As a side note, BGP Route Flap Damping was never implemented
correctly.  It should take the AS path into consideration and only
damp an unstable path if another existed that was quite stable, not
damp the prefix.  It was known that per prefix damping would cause
problems so in the spec, the AS path as part of the key field was not
optional.  If it had been implemented correctly we would not have the
problems with BGP Route Flap Damping that we have experienced. ]
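
To illustrate the keying point in the side note, here is a rough sketch of
an exponentially decayed penalty kept per (prefix, AS path) rather than per
prefix.  The class name and the parameter values (penalty per flap,
suppress and reuse thresholds, half-life) are assumptions chosen for
illustration, and the sketch does not implement the additional condition
that a stable alternative path must exist before damping:

```python
class FlapDamper:
    """Hypothetical damping state keyed by (prefix, AS path)."""

    def __init__(self, penalty_per_flap=1000.0, suppress=2000.0,
                 reuse=750.0, half_life=900.0):
        self.penalty_per_flap = penalty_per_flap
        self.suppress = suppress
        self.reuse = reuse
        self.half_life = half_life      # seconds
        self._state = {}                # key -> (penalty, last_time, suppressed)

    def _decayed(self, key, now):
        penalty, last, suppressed = self._state.get(key, (0.0, now, False))
        # Exponential decay: the penalty halves every `half_life` seconds.
        penalty *= 0.5 ** ((now - last) / self.half_life)
        return penalty, suppressed

    def flap(self, prefix, as_path, now):
        key = (prefix, tuple(as_path))
        penalty, suppressed = self._decayed(key, now)
        penalty += self.penalty_per_flap
        if penalty >= self.suppress:
            suppressed = True
        self._state[key] = (penalty, now, suppressed)

    def is_suppressed(self, prefix, as_path, now):
        key = (prefix, tuple(as_path))
        penalty, suppressed = self._decayed(key, now)
        if suppressed and penalty < self.reuse:
            suppressed = False          # decayed below the reuse threshold
        self._state[key] = (penalty, now, suppressed)
        return suppressed


d = FlapDamper()
for t in (0, 10, 20):                   # the path via AS 65001 flaps three times
    d.flap("192.0.2.0/24", [65001, 65010], now=t)
unstable = d.is_suppressed("192.0.2.0/24", [65001, 65010], now=30)
stable = d.is_suppressed("192.0.2.0/24", [65002, 65010], now=30)
later = d.is_suppressed("192.0.2.0/24", [65001, 65010], now=3620)
```

With this keying, the unstable path is suppressed while a second path to
the same prefix remains usable.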

My argument amounts to 1) route churn IS bounded, however 2) BGP Route
Flap Damping exists because a) the bound is uncomfortably high, b)
there used to be some broken implementations in use that were not
sufficiently robust, c) a small percentage of prefixes contributed
most of the churn, d) getting rid of that small percentage was
perceived as good for the Internet at large (the majority of stable
prefixes and reachable destinations).

Curtis


Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id NAA16226 for <idr-archive@nic.merit.edu>; Wed, 14 May 2003 13:16:29 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id 1F49B9121F; Wed, 14 May 2003 13:15:40 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id D6FF891256; Wed, 14 May 2003 13:15:39 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id 40E089121F for <idr@trapdoor.merit.edu>; Wed, 14 May 2003 13:15:38 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id 0E5275E156; Wed, 14 May 2003 13:15:37 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from halt-in.cisco.com (halt-in.cisco.com [171.70.144.185]) by segue.merit.edu (Postfix) with ESMTP id 5720D5DF3F for <idr@merit.edu>; Wed, 14 May 2003 13:15:36 -0400 (EDT)
Received: from cisco.com (171.71.163.13) by halt-in.cisco.com with ESMTP; 14 May 2003 10:15:51 -0800
Received: from cisco.com (keyupate-lnx.cisco.com [128.107.165.20]) by mira-sjc5-f.cisco.com (Mirapoint Messaging Server MOS 3.3.3-GR) with ESMTP id AGF19081; Wed, 14 May 2003 10:22:35 -0700 (PDT)
Message-ID: <3EC279B0.1010107@cisco.com>
Date: Wed, 14 May 2003 10:15:28 -0700
From: Keyur Patel <keyupate@cisco.com>
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.0.1) Gecko/20020823 Netscape/7.0
X-Accept-Language: en-us, en
MIME-Version: 1.0
To: curtis@fictitious.org
Cc: idr@merit.edu, David Meyer <dmm@maoz.com>
Subject: Re: draft-ietf-idr-bgp-analysis-03.txt
References: <200305141408.KAA78532@workhorse.fictitious.org>
Content-Type: text/plain; charset=us-ascii; format=flowed
Content-Transfer-Encoding: 7bit
Sender: owner-idr@merit.edu
Precedence: bulk

Curtis:
    Thanks for your comments. We will incorporate them in the next revision.
-Keyur

Curtis Villamizar wrote:

>David, Keyur,
>
>I have some suggestions for improvement to this draft ( BGP-4 Protocol
>Analysis <draft-ietf-idr-bgp-analysis-03.txt>).  See comments inline
>below.  Feel free to take what you consider valid and worth changing.
>
>Curtis
>
>
>In "1. Introduction" you should mention that BGP4 was the first to
>support CIDR and due to their lack of support for CIDR versions 1-3
>are considered obsolete and unusable in today's Internet.
>
>------------
>
>Somewhere in key features it should be mentioned that BGP makes the
>assumption that packets are routed from source towards destination
>independent of the source.  A good place for this would be near the
>statement "BGP does not make any assumptions about intra-autonomous
>system routing protocols deployed within the various autonomous
>systems".  Or refer to the statement in the beginning of
>"7. Applicability".
>
>------------
>
>In the following paragraph, why don't we just say that this algorithm
>is referred to as a "Path Vector" algorithm?
>
>   BGP uses an algorithm that is neither a pure distance vector
>   algorithm or a pure link state algorithm. It is instead a modified
>   distance vector algorithm that uses path information to avoid
>   traditional distance vector problems. Each route within BGP pairs
>   destination with path information to that destination. Path
>   information (also known as AS_PATH information) is stored within the
>   AS_PATH attribute in BGP. This allows BGP to reconstruct large
>   portions of overall topology whenever required.
>
>------------
>
>Alex probably made you put in some FSM stuff (don't read too much into
>the wording here).  I don't think it belongs in this document.  That
>is clearly something for the protocol spec.
>
>------------
>
>In the section "4. BGP Persistent Peer Oscillations" or in a nearby
>section (preferable) it should be mentioned that BGP is work
>conserving.  Here is some suggested text:
>
>   A robust BGP implementation is work conserving.  This means that if
>   the number of prefixes is bounded, arbitrarily high levels of route
>   change can be tolerated with bounded impact on route convergence
>   for occasional changes in generally stable routes.
>
>   A BGP implementation under high load conditions should drain as
>   many inbound routing updates as possible from its input streams,
>   processing only the most recent route if the route for a given NLRI
>   changes multiple times.  TCP also provides blocking on writes on the
>   sender side.  A BGP implementation under load should expect blocks
>   on write calls and send only the most recent routes when sockets
>   unblock rather than sending the entire history.
>
>   A robust implementation of BGP should have the following
>   characteristics:
>
>      1.  It is able to operate in almost arbitrarily high levels of
>          route flap without losing peerings (failing to send
>          keepalives) or losing other protocol adjacencies as a
>          result of BGP load.
>
>      2.  Instability of a subset of routes should not affect the
>          route advertisements or forwarding associated with the set
>          of stable routes.
>
>      3.  High levels of instability and peers of different CPU speed
>          or load resulting in faster or slower processing of routes
>          should not cause instability and should have a bounded
>          impact on the convergence time for generally stable routes.
>
>   Numerous robust BGP implementations exist.  Producing a robust
>   implementation is not a trivial matter but is clearly achievable.
>
>------------
>
>I find the following paragraph problematic without further
>explanation.
>
>   It is important to note that BGP does not require all the routers
>   within an autonomous system to participate in the BGP protocol. In
>   particular, only the border routers that provide connectivity between
>   the local autonomous system and their adjacent autonomous systems
>   need participate in BGP. The ability to constrain the set of BGP
>   speakers is one way to address scaling issues.
>
>Either you need to default to the borders and exit at any border or
>you need some mechanism to tunnel between border routers for a pure
>transport network.  I favor removing the above paragraph.  Tunnelling
>to remove BGP is out of scope.  Default routing to reach any arbitrary
>border need not be mentioned.  Things which actually do improve
>scaling within an AS are RR and confeds.
>
>------------
>
>Section "5.1 Link bandwidth and CPU utilization" may still be overly
>simplistic and as a result may be incorrect.  See comments below.
>
>
>------------
>
>In terms of bandwidth, the number of unique AS paths in practice is a
>small number compared to the number of NLRI.  Since many NLRI are
>packed in a single update with the AS path included only once, in
>practice the number of NLRI completely dominates the amount of
>bandwidth consumed.
>
>The MR = 4 * (N + (M * A)) may be inaccurate for the reasons I gave in
>the prior paragraph.  The M*A may drastically understate the impact of
>the unique AS paths.  Instead of defining A as the number of AS, A
>could be defined as the number of unique AS paths with M*A then being
>average AS path length times number of unique AS paths.
>
>Also why are both memory and bandwidth represented as MR?  Wouldn't BW
>be a better variable name for bandwidth?
>
>The O(C * M) thing in the next paragraph is also invalid above some
>value of C but for different reasons.  Above some value of C, either
>the sender will begin pacing its sending of updates on its own
>(suppressing multiple changes over very short periods) or the receiver
>will be unable to keep up with the rate of change and force
>suppression of multiple changes over very short periods by causing the
>BGP socket to block on the sender.
>
>------------
>
>The following statement is incorrect:
>
>   Finally, since the dynamic properties of the network cannot be
>   quantitatively bounded, stability must be addressed via heuristics
>   such as BGP Route Flap Damping [RFC2439]. Due to the nature of BGP,
>   such damping should be viewed as a matter local to an autonomous
>   system matter (see also Appendix F.2 of [BGP4]).
>
>The amount of change is inherently bounded in BGP (as I described
>above).  BGP Route Flap Damping was initially proposed for two
>reasons, 1) to protect a specific commercial implementation that was
>not sufficiently robust, 2) to improve convergence of stable routes.
>BGP Route Flap Damping is not necessary to bound the amount of change
>in BGP routing.
>
>------------
>
>We can drop the following comparison to a historic protocol:
>
>   It may also be instructive to compare bandwidth and CPU requirements
>   of BGP with the Exterior Gateway Protocol (EGP). While with BGP the
>   complete information is exchanged only at the connection
>   establishment time, with EGP the complete information is exchanged
>   periodically (usually every 3 minutes). Note that both for BGP and
>   for EGP the amount of information exchanged is roughly on the order
>   of the number of networks reachable via a peer that sends the
>   information. Therefore, even if one assumes extreme  instabilities of
>   BGP, its worst case behavior will be the same as the steady state
>   behavior of its predecessor, EGP.
>
>   Operational experience with BGP showed that the incremental update
>   approach employed by BGP provides qualitative improvement in both
>   bandwidth and CPU utilization when compared with complete periodic
>   updates used by EGP (see also presentation by Dennis Ferguson at the
>   Twentieth IETF, March 11-15, 1991, St. Louis).
>
>We should drop other references to EGP.
>
>------------
>
>In "5.1.2. Memory requirements", the MR = O((N + M * A) * K) the same
>comment applies regarding the M*A term.  A should be unique AS paths,
>not number of AS and is not multiplied by K.  In practice the K term
>is small because it is the number of peers sending full routing, which
>is generally much less than the worst case number of peers.  Large
>providers who carry full routing typically send each other only their
>customer routes to avoid providing free transit to each other.  This
>reduces the impact of K.
>
>------------
>
>We can drop:
>
>   It is interesting to note that prior to the introduction of BGP in
>   the NSFNET Backbone, memory requirements on the NSFNET Backbone
>   routers running EGP were on the order of O(N *K).
>
>------------
>
>In the MR = ((N*4) + (M*A)*2) * K, we can make this quite accurate by
>defining N as the average number of routes advertised by each peer and
>A as the number of unique AS paths, moving (M*A)*2 outside of the *K,
>and changing N*4 to N*R and (M*A)*2 to (M*A)*P, where R is the number
>of bytes required to store a route and P is the number of bytes needed
>to store one AS in an AS path.  If K is small, then some overhead such
>as the patricia trie storage figures into R, and if K is large, the
>data structures may be linked lists off the patricia trie.  Claiming
>that a route can be stored in 4 bytes is rather naive.
>
>The N*R*K term in practice dominates over the (M*A)*P term if we
>conservatively estimate a route as taking 16 bytes (large K allowing
>patricia trie overhead to be ignored, indices or pointers to unique AS
>path or other attributes, etc).  If we wanted to include a term to
>make things accurate for small K we could add U*X where U is the
>number of unique NLRI and X is the overhead per unique NLRI (and I ran
>out of useful single letters).  Typically X is greater than R.
>
>------------
>
>   Interestingly, in his review of the BGP protocol for the BGP review
>   committee in March of 1990, Paul Tsuchiya noted that "BGP does not
>   scale well.
>
>Paul was wrong.  It does scale well and that's why it is being used.
>BGP is a solution that scales as well as the problem allows and no
>better.
>
>------------
>
>In "10. Security Considerations" we can reference the separate
>security analysis document.  Or maybe not.
>
>
>  
>
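
The revised memory estimate suggested in the quoted comments, MR = N*R*K +
(M*A)*P, optionally plus U*X, can be worked through numerically.  In this
sketch R = 16 bytes comes from the comments and P = 2 bytes from the
draft's original formula; every other input value (routes per peer, peer
count, path length, unique paths) is a hypothetical number chosen only to
show the relative size of the terms:

```python
def bgp_memory_estimate(n_routes_per_peer, k_peers, r_bytes_per_route,
                        m_avg_path_len, a_unique_paths, p_bytes_per_as,
                        u_unique_nlri=0, x_overhead_per_nlri=0):
    """Return (route_term, path_term, nlri_term, total) in bytes."""
    route_term = n_routes_per_peer * r_bytes_per_route * k_peers  # N*R*K
    path_term = m_avg_path_len * a_unique_paths * p_bytes_per_as  # (M*A)*P
    nlri_term = u_unique_nlri * x_overhead_per_nlri               # U*X
    return route_term, path_term, nlri_term, route_term + path_term + nlri_term


# Hypothetical inputs: 120,000 routes from each of 4 full-routing peers,
# 20,000 unique AS paths with an average length of 4.
route_term, path_term, nlri_term, total = bgp_memory_estimate(
    n_routes_per_peer=120_000, k_peers=4, r_bytes_per_route=16,
    m_avg_path_len=4, a_unique_paths=20_000, p_bytes_per_as=2)
```

With these assumed inputs the N*R*K term is 7,680,000 bytes against
160,000 bytes for the AS path term, which is the dominance the comments
describe.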



Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id LAA12318 for <idr-archive@nic.merit.edu>; Wed, 14 May 2003 11:10:18 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id 2FDC19129B; Wed, 14 May 2003 11:09:44 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id F376E9129C; Wed, 14 May 2003 11:09:43 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id 63B849129B for <idr@trapdoor.merit.edu>; Wed, 14 May 2003 11:09:42 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id 512425E0F7; Wed, 14 May 2003 11:09:42 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from workhorse.fictitious.org (workhorse.fictitious.org [209.150.1.230]) by segue.merit.edu (Postfix) with ESMTP id B5C3C5E089 for <idr@merit.edu>; Wed, 14 May 2003 11:09:39 -0400 (EDT)
Received: from workhorse.fictitious.org (localhost.fictitious.org [127.0.0.1]) by workhorse.fictitious.org (8.9.3/8.9.3) with ESMTP id LAA79394; Wed, 14 May 2003 11:08:04 -0400 (EDT) (envelope-from curtis@workhorse.fictitious.org)
Message-Id: <200305141508.LAA79394@workhorse.fictitious.org>
To: "Joris Dobbelsteen" <joris.dobbelsteen@mail.com>
Cc: "'IDR WG (E-mail)'" <idr@merit.edu>
Reply-To: curtis@fictitious.org
Subject: Re: Issue 19) Security Considerations 
In-reply-to: Your message of "Mon, 14 Apr 2003 23:32:56 +0200." <001201c302cd$6ba36f10$0d0ca8c0@joris2k.local> 
Date: Wed, 14 May 2003 11:08:04 -0400
From: Curtis Villamizar <curtis@fictitious.org>
Sender: owner-idr@merit.edu
Precedence: bulk

In message <001201c302cd$6ba36f10$0d0ca8c0@joris2k.local>, "Joris Dobbelsteen" 
writes:
> Curtis, it was not my intention to upset you in any way. Still it looks
> like I succeeded in doing that....... Anyway, we do need some feedback
> from practical deployments.

I intended to respond to this message after clearing a few things up,
but it got deferred for much longer than it should have.  No offense
taken.  None intended either.

> My personal consideration would be not to put it in a chapter 4.2 but
> get it throughout the entire draft.

That would be fine.

> Packet filters provide the good protection for IBGP sessions. These are
> harder/impractical for EBGP(?).
> Internal attacks and hacks on the network channels are not considered
> practical.

The reason that packet filters are hard to get deployed on EBGP is the
attack can come from your peer and you have no control over your
peer's router, except to request that filters be put in place.

If an internal router is compromised there are many attacks possible
in addition to BGP.  The same applies to an internal compromise to
other critical infrastructure machines such as NMS.

> IPSec is not recommended for at least three reasons:
> TCP-MD5 is already widely implemented and deployed.
> Complication in network management, because packet filters need to be
> adapted for IPSec traffic (port 500 - IKE).
> IPSec has higher processing demands than TCP-MD5 does, lowering the
> barrier for a successful DoS attack. Hardware IPSec implementations are
> considered impractical, due to the cost or performance.
> 
> This brings up your statement that IPSec covers the ports. This is not
> true when encryption (ESP) is not used. The use of integrity (AH)
> will not hide the ports; it works sort of like TCP-MD5, if you want...
> I also never considered using ESP, since it was not desired and it
> would be the way to 'promote' DoS attacks.
> Unfortunately I don't have any insight into IPSec hardware; maybe someone
> else can give some insight on this.
> 
> I suppose this closes the IPSec/TCP-MD5 issue, leaving TCP-MD5 to the 
> current best option.

Use of AH is an option.  It offers little benefit over TCP-MD5.  It
would be better if there was a way for ESP to expose the ports.

> Still this leaves EBGP open for consideration.
> 
> The use of BGP TTL Security Hack (BTSH) might be a very simple and
> effective way of protecting the external routers from attacks,
> especially DoS attacks. Malicious data can be prevented using TCP-MD5.
> The disadvantage is that it is not deployed widely, but it only requires
> routers to send their traffic with an initial TTL of 255. I don't
> know if this is currently possible to achieve: whether current
> implementations can be set to send with an initial TTL of 255. It is
> not needed for both end-points to support BTSH, although desired.
> <draft-gill-btsh-01.txt>

It is simple.  It is easy.  It works.  This TTL check can be done in
hardware in many (most?) routers.
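
A rough sketch of the receive-side TTL check, assuming the peer sends with
an initial TTL of 255 as described above.  The function name and the
allowed_decrements parameter are hypothetical; whether a directly
connected peer's packet is seen with TTL 255 or 254 depends on where an
implementation counts the decrement, so that value is left configurable:

```python
MAX_TTL = 255  # initial TTL the peer is expected to send with


def ttl_acceptable(received_ttl, allowed_decrements=0):
    """Accept a packet only if its TTL shows it came from close enough.

    A sender more than `allowed_decrements` hops away cannot deliver a
    packet with a TTL this high, because each router on the way lowers it.
    """
    if not 0 <= received_ttl <= MAX_TTL:
        raise ValueError("TTL must be in the range 0..255")
    return received_ttl >= MAX_TTL - allowed_decrements


direct = ttl_acceptable(255)                            # adjacent peer
remote = ttl_acceptable(64)                             # typical remote default TTL
one_hop = ttl_acceptable(254, allowed_decrements=1)     # if one decrement is counted
```

The check is a single comparison on a header field, which is consistent
with it being cheap enough to do in forwarding hardware.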

Another mitigation of attack is to build routers such that the BGP TCP
(and optionally the whole Adj-In and much of the AdjOut handling) is
done on the line card, minimally the TCP-MD5 authentication.  Some
routers also include internal hardware queues that can be used for SFQ
handling of traffic to the line card from outside.  If an attack
occurs on an EBGP peer, the attack only affects peers served by that
line card and, if SFQ is available, may only affect that peer.  If so,
peers that do not implement (or enable) filtering or BTSH affect only
their own connectivity.

> I couldn't find any information about dynamic 4-tuple filtering
> (google turns up BGP dynamic capabilities), unfortunately.

Vijay Gill gave a presentation at NANOG.  Once a BGP peering is
established, a 4-tuple filter (src/dst addr+port) is installed for it,
giving that traffic priority over other 4-tuples.  A DoS attack on the
integrity check then impacts the ability to get to the established state
but does not impact established connections.
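
The idea can be sketched as a classifier that steers packets into a
priority queue only when their 4-tuple belongs to an established peering.
The class, method, and queue names here are hypothetical and are not taken
from the NANOG presentation or from any router implementation:

```python
class PeerFilter:
    """Hypothetical dynamic 4-tuple filter for established BGP sessions."""

    def __init__(self):
        self._established = set()

    def on_established(self, src_addr, src_port, dst_addr, dst_port):
        # Install the filter once the session reaches the established state.
        self._established.add((src_addr, src_port, dst_addr, dst_port))

    def on_closed(self, src_addr, src_port, dst_addr, dst_port):
        self._established.discard((src_addr, src_port, dst_addr, dst_port))

    def queue_for(self, src_addr, src_port, dst_addr, dst_port):
        # Known sessions get the priority queue; everything else, including
        # spoofed SYNs and packets that will fail the integrity check,
        # competes in the best-effort queue.
        if (src_addr, src_port, dst_addr, dst_port) in self._established:
            return "priority"
        return "best-effort"


f = PeerFilter()
f.on_established("192.0.2.1", 40000, "192.0.2.2", 179)
known = f.queue_for("192.0.2.1", 40000, "192.0.2.2", 179)
unknown = f.queue_for("192.0.2.1", 40001, "192.0.2.2", 179)
```

An attacker who does not know the peer's ephemeral port lands in the
best-effort queue, so a flood there delays new sessions without starving
established ones.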

> So what else can be done to protect an EBGP session from attacks,
> especially DoS attacks? Data insertion/manipulations can be guarded
> against using TCP-MD5 (or similar).

Traffic based DoS which overwhelms the integrity check is the hardest
problem we currently face.  It is solved for IBGP by adequate
filtering and keeping the routers or other infrastructure machines
from being compromised.  BTSH solves it for EBGP unless the peer is
compromised.  Limiting the impact of a full bandwidth attack from a
peer is something that a limited set of routers may be capable of.

> - Joris

Sorry for the delay.  I deferred a response to this and then it got
lost/buried in my inbox.

Curtis


> >-----Original Message-----
> >From: owner-idr@merit.edu [mailto:owner-idr@merit.edu]On Behalf Of
> >Curtis Villamizar
> >Sent: Thursday, 3 April 2003 16:36
> >To: Joris Dobbelsteen
> >Cc: 'IDR WG (E-mail)'
> >Subject: Re: Issue 19) Security Considerations 
> >
> >In message <001b01c2f52e$48505480$0d0ca8c0@joris2k.local>, 
> >"Joris Dobbelsteen" 
> >writes:
> >> [snip]
> >
> >Joris,
> >
> >Quite frankly I'm outraged at such comments.  Only by looking at
> >theoretical security issues and ignoring reality (not that the two
> >don't highly but imperfectly overlap) can you come to the conclusion
> >that IPSEC is needed and in its current form is viable as a security
> >solution for BGP.
> >
> >I think it's about time we injected some reality into
> >draft-murphy-bgp-vuln-02.txt.
> >
> >I've added a practical considerations section.  I stuck it in as 4.2.
> >
> >Comments are welcome, particularly comments from people actually
> >running BGP networks or building BGP routers used by ISPs.
> >
> >I did not mention advanced filtering works-in-progress or proposals
> >such as BTSH or dynamic 4-tuple EBGP filtering since these are not yet
> >implemented or deployed afaik.  [aside: I strongly believe that BTSH
> >will prove to be a viable way to protect EBGP and a preferable
> >replacement for current filtering which some older TTM (time-to-market)
> >line cards still in use are unable to support.]
> >
> >I should also note that the filtering best practices are far from
> >universally deployed and in some cases are difficult to fully deploy
> >due to residual use of TTM line cards unable to support filtering.
> >
> >Note that IPSEC with port numbers exposed would be a viable security
> >solution.  It would still be a greater computational burden than
> >TCP-MD5 and still might be less preferred by ISPs for that reason for
> >some architectures.  This change to IPSEC would at least yield two
> >viable options and might encourage implementation and deployment of
> >IPSEC as a security solution for BGP.
> >
> >Curtis
> >
> >
> >--- draft-murphy-bgp-vuln-02.txt	Wed Mar  5 21:00:00 2003
> >+++ draft-murphy-bgp-vuln-02.txt++	Thu Apr  3 09:18:12 2003
> >@@ -149,6 +149,7 @@
> > 3.2.2.2 Timer events 
> >..............................................   16
> > 4 Security Considerations 
> >.........................................   16
> > 4.1 Residual Risk 
> >.................................................   16
> >+4.2 Practical Considerations 
> >......................................   16
> > 5 References 
> >......................................................   17
> > 6 Author's Address 
> >................................................   18
> > 
> >@@ -901,6 +902,79 @@
> > Filtering is in use near some customer attachment points, but is not
> > effective near the Internet center.  The other mechanisms are still
> > controversial and are not yet in common use.
> >+
> >+4.2 Practical Considerations
> >+
> >+The primary usage of BGP is as a means to provide reachability
> >+information to Autonomous Systems (AS) and to distribute external
> >+reachability internally within an AS.  BGP is the routing protocol
> >+used to distribute global routing information in the Internet.  BGP is
> >+therefore used by all major Internet Service Providers (ISP) and many
> >+smaller providers and other organizations.  
> >+
> >+The role which BGP plays in the Internet puts BGP implementations in
> >+unique conditions and places unique security requirements on BGP.  BGP
> >+is operated over interprovider interfaces in which traffic levels push
> >+the state of the art in specialized packet forwarding hardware and
> >+exceed the performace capabilities of hardware implementation of
> >+decryption by many decimal orders of magnitude.  
> >+
> >+ISP networks must be and are under tight control.  The only viable
> >+means to protect the network elements from Denial of Service (DoS)
> >+attacks under such conditions are packet based filtering techniques
> >+based on relatively simple inspections of packets.
> >+
> >+To protect Internal BGP (IBGP) sessions, filters are applied at all
> >+borders to an ISP network which remove all traffic destined for
> >+addresses of network elements internal addresses (typically contained
> >+within a single prefix) and the BGP port number (179).  Packets from
> >+within an ISP are not forwarded from an internal interface to the BGP
> >+speaker's address on which External BGP (EBGP) sessions are supported,
> >+or to a peer's EBGP address if the BGP port number is found.  With
> >+appropriate consideration in router design, in the event of failure of
> >+a BGP peer to provide the equivalent filtering the risk of compromise
> >+can be limited to the peering session on which filtering is not
> >+performed by the peer or the interface or line card on which the
> >+peering is supported.  There is substantial motivation and little
> >+effort for ISPs to maintain such filters.
> >+
> >+Being composed entirely of specialized network equipment, under strict
> >+control of the ISP, the ISP network is not subject to attacks from
> >+within than enterprise networks are with more generalized computing
> >+systems and staff less carefully trained in the area of secure
> >+procedures.  ...
> 
> For me personally, the above sentence is a little hard to understand.
> You mean that ???
> The Internal BGP routers (or specialized network equipment) that is under
> strict control of the ISP is not subject to attacks, other than those that
> are common in enterprise networks. These networks have staff that is
> less carefully trained in the area of security procedures.
> 
> >+ ...........  Monitoring of traffic from within requires either
> >+compromise of relatively physically secure and carefully administered
> >+network elements or monitoring physical media.  Injection of traffic
> >+requires either compromise of network elements or intercept and
> >+replacement of traffic on physical media.
> >+
> >+The difficulty of compromise of network elements and of undetected
> >+tapping into physical media carrying extremely high volumes of traffic
> >+is much greater than the difficulty of injecting sufficient traffic
> >+from outside a network to effect a DoS attack.  As a result, the
> >+ability to packet filter on the basis of port numbers far exceeds the
> >+need to cryptographic strength in encapsulation.
> >+
> >+These practical considerations yield the situation in which TCP-MD5,
> >+though cryptographic weak, far better serves ISP security needs than
> >+the cryptographicly much stronger IPSEC which makes packet filtering
> >+infeasible.
> >+
> >+Use of BGP in smaller networks yields similar requirements.  The
> >+capability of a single workstation with high speed interface to
> >+generate false traffic far exceeds the capability of software based
> >+decryption or appropriately priced cryptographic hardware.  From a
> >+practical standpoint, these networks are also better served by
> >+appropriate administrative care, filtering, and TCP-MD5 than by IPSEC.
> >+
> >+This situation is likely to persist unless either cryptographic
> >+hardware becomes many orders of magnitude faster and cheaper or IPSEC
> >+supports an ability to leave IP port numbers exposed.  This
> >+requirement has been made known to the IPSEC WG.
> >+
> 
> See above, using integrity only (AH) leaves TCP (not IP) port numbers
> readable for everyone. This security is rather an IPSec configuration
> option.
> 
> >+Until such time as IPSEC is modified, there is little choice but to
> >+mandate TCP-MD5 implementation and recommend TCP-MD5 usage for BGP and
> >+discourage IPSEC usage for BGP.
> > 
> > 5.  References
> > 
> >
> 


Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id KAA09681 for <idr-archive@nic.merit.edu>; Wed, 14 May 2003 10:10:46 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id D1C9491214; Wed, 14 May 2003 10:10:13 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id 995A791299; Wed, 14 May 2003 10:10:13 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id 1580791214 for <idr@trapdoor.merit.edu>; Wed, 14 May 2003 10:10:12 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id F23CC5E0E3; Wed, 14 May 2003 10:10:11 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from workhorse.fictitious.org (workhorse.fictitious.org [209.150.1.230]) by segue.merit.edu (Postfix) with ESMTP id CDCB65E0D7 for <idr@merit.edu>; Wed, 14 May 2003 10:10:09 -0400 (EDT)
Received: from workhorse.fictitious.org (localhost.fictitious.org [127.0.0.1]) by workhorse.fictitious.org (8.9.3/8.9.3) with ESMTP id KAA78532; Wed, 14 May 2003 10:08:41 -0400 (EDT) (envelope-from curtis@workhorse.fictitious.org)
Message-Id: <200305141408.KAA78532@workhorse.fictitious.org>
To: idr@merit.edu
Cc: curtis@fictitious.org
Reply-To: curtis@fictitious.org
Subject: draft-ietf-idr-bgp-analysis-03.txt
Date: Wed, 14 May 2003 10:08:40 -0400
From: Curtis Villamizar <curtis@fictitious.org>
Sender: owner-idr@merit.edu
Precedence: bulk

David, Keyur,

I have some suggestions for improvement to this draft ( BGP-4 Protocol
Analysis <draft-ietf-idr-bgp-analysis-03.txt>).  See comments inline
below.  Feel free to take what you consider valid and worth changing.

Curtis


In "1. Introduction" you should mention that BGP4 was the first
version to support CIDR and that, due to their lack of support for
CIDR, versions 1-3 are considered obsolete and unusable in today's
Internet.

------------

Somewhere in key features it should be mentioned that BGP makes the
assumption that packets are routed from source towards destination
independent of the source.  A good place for this would be near the
statement "BGP does not make any assumptions about intra-autonomous
system routing protocols deployed within the various autonomous
systems".  Or refer to the statement in the beginning of
"7. Applicability".

------------

In the following paragraph, why don't we just say that this algorithm
is referred to as a "Path Vector" algorithm?

   BGP uses an algorithm that is neither a pure distance vector
   algorithm or a pure link state algorithm. It is instead a modified
   distance vector algorithm that uses path information to avoid
   traditional distance vector problems. Each route within BGP pairs
   destination with path information to that destination. Path
   information (also known as AS_PATH information) is stored within the
   AS_PATH attribute in BGP. This allows BGP to reconstruct large
   portions of overall topology whenever required.

------------

Alex probably made you put in some FSM stuff (don't read too much into
the wording here).  I don't think it belongs in this document.  That
is clearly something for the protocol spec.

------------

In the section "4. BGP Persistent Peer Oscillations" or in a nearby
section (preferable) it should be mentioned that BGP is work
conserving.  Here is some suggested text:

   A robust BGP implementation is work conserving.  This means that if
   the number of prefixes is bounded, arbitrarily high levels of route
   change can be tolerated with bounded impact on route convergence
   for occasional changes in generally stable routes.

   A BGP implementation under high load conditions should drain as
   many inbound routing updates as possible from its input streams,
   processing only the most recent route if the route for a given NLRI
   changes multiple times.  TCP also provides blocking on the writes
   on the sender side.  A BGP implementation under load should expect
   blocks on write calls and send only the most recent routes when
   sockets unblock rather than sending the entire history.

   A robust implementation of BGP should have the following
   characteristics:

      1.  It is able to operate in almost arbitrarily high levels of
          route flap without losing peerings (failing to send
          keepalives) or losing other protocol adjacencies as a
          result of BGP load.

      2.  Instability of a subset of routes should not affect the
          route advertisements or forwarding associated with the set
          of stable routes.

      3.  High levels of instability and peers of different CPU speed
          or load resulting in faster or slower processing of routes
          should not cause instability and should have a bounded
          impact on the convergence time for generally stable routes.

   Numerous robust BGP implementations exist.  Producing a robust
   implementation is not a trivial matter but is clearly achievable.
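
A rough sketch of the coalescing behaviour in the suggested text
above, in Python.  The class and method names are hypothetical and not
taken from any implementation; the point is only that pending changes
are keyed by NLRI, so the backlog is bounded by the number of prefixes
rather than by the rate of change.

```python
from collections import OrderedDict


class CoalescingUpdateQueue:
    """Pending route changes keyed by prefix (hypothetical sketch).

    Only the most recent state per prefix is kept, so the queue never
    holds more entries than there are distinct prefixes.
    """

    def __init__(self):
        # prefix -> latest attributes; None stands for a withdraw
        self._pending = OrderedDict()

    def push(self, prefix, attrs):
        # A newer change overwrites the older one for the same prefix.
        # The prefix keeps its original place in the queue.
        self._pending[prefix] = attrs

    def drain(self, budget):
        """Return up to `budget` (prefix, attrs) pairs, oldest first."""
        out = []
        while self._pending and len(out) < budget:
            out.append(self._pending.popitem(last=False))
        return out

    def __len__(self):
        return len(self._pending)


# One prefix flapping 1000 times plus one withdraw leaves two entries.
queue = CoalescingUpdateQueue()
for i in range(1000):
    queue.push("192.0.2.0/24", {"as_path": (64500, 64501 + i % 2)})
queue.push("198.51.100.0/24", None)
backlog = len(queue)
```

Draining with a budget stands in for a socket that accepts only part
of the backlog before blocking again; whatever remains keeps
coalescing until the next drain.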

------------

I find the following paragraph problematic without further
explanation.

   It is important to note that BGP does not require all the routers
   within an autonomous system to participate in the BGP protocol. In
   particular, only the border routers that provide connectivity between
   the local autonomous system and their adjacent autonomous systems
   need participate in BGP. The ability to constrain the set of BGP
   speakers is one way to address scaling issues.

Either you need to default to the borders and exit at any border or
you need some mechanism to tunnel between border routers for a pure
transport network.  I favor removing the above paragraph.  Tunnelling
to remove BGP is out of scope.  Default routing to reach any arbitrary
border need not be mentioned.  Things which actually do improve
scaling within an AS are RR and confeds.

------------

Section "5.1 Link bandwidth and CPU utilization" may still be overly
simplistic and as a result may be incorrect.  See comments below.


------------

In terms of bandwidth, the number of unique AS paths in practice is a
small number compared to the number of NLRI.  Since many NLRI are
packed in a single update with the AS path included only once, in
practice the number of NLRI completely dominates the amount of
bandwidth consumed.

The MR = 4 * (N + (M * A)) may be inaccurate for the reasons I gave in
the prior paragraph.  The M*A may drastically understate the impact of
the unique AS paths.  Instead of defining A as the number of ASes, A
could be defined as the number of unique AS paths with M*A then being
average AS path length times number of unique AS paths.
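
To make the redefinition concrete, here is a small Python sketch that
groups routes by AS path, the way NLRI are packed into updates, and
counts each path once.  The function name and the per-item sizes (4
bytes per NLRI, 2 bytes per AS) are assumptions for illustration, and
fixed per-message overhead is ignored.

```python
from collections import defaultdict


def initial_exchange_bytes(routes, bytes_per_nlri=4, bytes_per_as=2):
    """Estimate (nlri_bytes, path_bytes) for a full table exchange.

    `routes` is an iterable of (prefix, as_path) pairs, where as_path
    is a tuple of AS numbers.  Sizes are illustrative assumptions.
    """
    by_path = defaultdict(list)
    for prefix, as_path in routes:
        by_path[as_path].append(prefix)

    # N term: every NLRI is sent once.
    nlri_bytes = sum(len(p) for p in by_path.values()) * bytes_per_nlri
    # M*A term: each unique AS path is sent once, whatever its fan-out.
    path_bytes = sum(len(path) for path in by_path) * bytes_per_as
    return nlri_bytes, path_bytes


example = initial_exchange_bytes([
    ("192.0.2.0/24", (64500, 64501)),
    ("198.51.100.0/24", (64500, 64501)),
    ("203.0.113.0/24", (64500,)),
])
```

The sum of the unique path lengths is exactly M*A with A taken as the
number of unique AS paths and M as their average length.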

Also why are both memory and bandwidth represented as MR?  Wouldn't BW
be a better variable name for bandwidth?

The O(C * M) thing in the next paragraph is also invalid above some
value of C but for different reasons.  Above some value of C, either
the sender will begin pacing its sending of updates on its own
(suppressing multiple changes over very short periods) or the receiver
will be unable to keep up with the rate of change and force
suppression of multiple changes over very short periods by causing the
BGP socket to block on the sender.

------------

The following statement is incorrect:

   Finally, since the dynamic properties of the network cannot be
   quantitatively bounded, stability must be addressed via heuristics
   such as BGP Route Flap Damping [RFC2439]. Due to the nature of BGP,
   such damping should be viewed as a matter local to an autonomous
   system matter (see also Appendix F.2 of [BGP4]).

The amount of change is inherently bounded in BGP (as I described
above).  BGP Route Flap Damping was initially proposed for two
reasons: 1) to protect a specific commercial implementation that was
not sufficiently robust, and 2) to improve convergence of stable
routes.  BGP Route Flap Damping is not necessary to bound the amount
of change in BGP routing.

------------

We can drop the following comparison to a historic protocol:

   It may also be instructive to compare bandwidth and CPU requirements
   of BGP with the Exterior Gateway Protocol (EGP). While with BGP the
   complete information is exchanged only at the connection
   establishment time, with EGP the complete information is exchanged
   periodically (usually every 3 minutes). Note that both for BGP and
   for EGP the amount of information exchanged is roughly on the order
   of the number of networks reachable via a peer that sends the
   information. Therefore, even if one assumes extreme  instabilities of
   BGP, its worst case behavior will be the same as the steady state
   behavior of its predecessor, EGP.

   Operational experience with BGP showed that the incremental update
   approach employed by BGP provides qualitative improvement in both
   bandwidth and CPU utilization when compared with complete periodic
   updates used by EGP (see also presentation by Dennis Ferguson at the
   Twentieth IETF, March 11-15, 1991, St. Louis).

We should drop other references to EGP.

------------

In "5.1.2. Memory requirements", the same comment applies to the M*A
term of MR = O((N + M * A) * K).  A should be the number of unique AS
paths, not the number of ASes, and the M*A term is not multiplied by
K.  In practice the K term is small because it is the number of peers
sending full routing, which is generally much less than the worst case
number of peers.  Large providers who carry full routing typically
send each other only their customer routes to avoid providing free
transit to each other.  This reduces the impact of K.

------------

We can drop:

   It is interesting to note that prior to the introduction of BGP in
   the NSFNET Backbone, memory requirements on the NSFNET Backbone
   routers running EGP were on the order of O(N *K).

------------

In the MR = ((N*4) + (M*A)*2) * K, we can make this quite accurate by
defining N as the average number of routes advertised by each peer and
A as the number of unique AS paths, moving (M*A)*2 outside of the *K,
and changing N*4 to N*R and (M*A)*2 to (M*A)*P, where R is the number
of bytes required to store a route and P is the number of bytes needed
to store one AS in an AS path.  That gives MR = (N*R*K) + (M*A)*P.  If
K is small, then some overhead such as the patricia trie storage
figures into R and if K is large, the data structures may be linked
lists off the patricia trie.  Claiming that a route can be stored in 4
bytes is rather naive.

The N*R*K term in practice dominates over the (M*A)*P term if we
conservatively estimate a route as taking 16 bytes (large K allowing
patricia trie overhead to be ignored, indices or pointers to unique AS
Path or other attributes, etc).  If we wanted to include a term to
make things accurate for small K we could add U*X where U is the
number of unique NLRI and X is the overhead per unique NLRI (and I ran
out of useful single letters).  Typically X is greater than R.
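
As a worked illustration of MR = N*R*K + (M*A)*P + U*X, a minimal
Python sketch follows.  The function name is hypothetical, and every
number in the example call is made up for illustration rather than
measured; only R = 16 and P = 2 echo figures mentioned in this thread.

```python
def rib_memory_bytes(n, k, m, a, r=16, p=2, u=0, x=0):
    """Estimate RIB memory as N*R*K + (M*A)*P + U*X.

    n: average number of routes advertised by each peer
    k: number of peers sending full routing
    m: average AS path length
    a: number of unique AS paths
    r: bytes to store one route (assumed 16)
    p: bytes to store one AS in an AS path (assumed 2)
    u: number of unique NLRI (only matters for small k)
    x: overhead in bytes per unique NLRI
    """
    per_peer_routes = n * r * k      # scales with the number of peers
    unique_as_paths = m * a * p      # stored once, not multiplied by k
    per_prefix_overhead = u * x      # trie node and similar overhead
    return per_peer_routes + unique_as_paths + per_prefix_overhead


# Hypothetical inputs: 120,000 routes from each of 10 peers,
# 20,000 unique AS paths with an average length of 4.
route_term = 120_000 * 16 * 10
path_term = 4 * 20_000 * 2
total = rib_memory_bytes(n=120_000, k=10, m=4, a=20_000)
```

With these made-up inputs the route term is over a hundred times the
AS path term, which is the sense in which N*R*K dominates.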

------------

   Interestingly, in his review of the BGP protocol for the BGP review
   committee in March of 1990, Paul Tsuchiya noted that "BGP does not
   scale well.

Paul was wrong.  It does scale well and that's why it is being used.
BGP is a solution that scales as well as the problem allows and no
better.

------------

In "10. Security Considerations" we can reference the separate
security analysis document.  Or maybe not.



Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id HAA04703 for <idr-archive@nic.merit.edu>; Wed, 14 May 2003 07:24:15 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id C64D591210; Wed, 14 May 2003 07:23:47 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id 87C0291211; Wed, 14 May 2003 07:23:47 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id 4D1A391210 for <idr@trapdoor.merit.edu>; Wed, 14 May 2003 07:23:46 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id 356D35E05E; Wed, 14 May 2003 07:23:46 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from ietf.org (odin.ietf.org [132.151.1.176]) by segue.merit.edu (Postfix) with ESMTP id B505E5DEDD for <idr@merit.edu>; Wed, 14 May 2003 07:23:45 -0400 (EDT)
Received: from CNRI.Reston.VA.US (localhost [127.0.0.1]) by ietf.org (8.9.1a/8.9.1a) with ESMTP id HAA12720; Wed, 14 May 2003 07:20:42 -0400 (EDT)
Message-Id: <200305141120.HAA12720@ietf.org>
Mime-Version: 1.0
Content-Type: Multipart/Mixed; Boundary="NextPart"
To: IETF-Announce: ;
Cc: idr@merit.edu
From: Internet-Drafts@ietf.org
Reply-To: Internet-Drafts@ietf.org
Subject: I-D ACTION:draft-ietf-idr-bgp-analysis-03.txt
Date: Wed, 14 May 2003 07:20:42 -0400
Sender: owner-idr@merit.edu
Precedence: bulk

--NextPart

A New Internet-Draft is available from the on-line Internet-Drafts directories.
This draft is a work item of the Inter-Domain Routing Working Group of the IETF.

	Title		: BGP-4 Protocol Analysis
	Author(s)	: D. Meyer, K. Patel
	Filename	: draft-ietf-idr-bgp-analysis-03.txt
	Pages		: 19
	Date		: 2003-5-13
	
The purpose of this report is to document how the requirements for
advancing a routing protocol from Draft Standard to full Standard
have been satisfied by Border Gateway Protocol version 4 (BGP-4).
This report satisfies the requirement for 'the second report', as
described in Section 6.0 of RFC 1264 [RFC1264].  In order to fulfill
the requirement, this report augments RFC 1774 [RFC1774] and
summarizes the key features of BGP protocol, and analyzes the
protocol with respect to scaling and performance.

A URL for this Internet-Draft is:
http://www.ietf.org/internet-drafts/draft-ietf-idr-bgp-analysis-03.txt

Below is the data which will enable a MIME compliant mail reader
implementation to automatically retrieve the ASCII version of the
Internet-Draft.

--NextPart
Content-Type: Multipart/Alternative; Boundary="OtherAccess"

--OtherAccess
Content-Type: Message/External-body;
	access-type="mail-server";
	server="mailserv@ietf.org"

Content-Type: text/plain
Content-ID:	<2003-5-13133436.I-D@ietf.org>

ENCODING mime
FILE /internet-drafts/draft-ietf-idr-bgp-analysis-03.txt

--OtherAccess
Content-Type: Message/External-body;
	name="draft-ietf-idr-bgp-analysis-03.txt";
	site="ftp.ietf.org";
	access-type="anon-ftp";
	directory="internet-drafts"

Content-Type: text/plain
Content-ID:	<2003-5-13133436.I-D@ietf.org>

--OtherAccess--

--NextPart--




Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id DAA08041 for <idr-archive@nic.merit.edu>; Tue, 13 May 2003 03:26:33 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id E89CA9126F; Tue, 13 May 2003 03:26:07 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id B00B391270; Tue, 13 May 2003 03:26:06 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id 81A689126F for <idr@trapdoor.merit.edu>; Tue, 13 May 2003 03:26:05 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id 5CC7A5E067; Tue, 13 May 2003 03:26:05 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from sj-core-2.cisco.com (sj-core-2.cisco.com [171.71.177.254]) by segue.merit.edu (Postfix) with ESMTP id 0B95F5E063 for <idr@merit.edu>; Tue, 13 May 2003 03:26:05 -0400 (EDT)
Received: from cisco.com (router.cisco.com [171.69.182.20]) by sj-core-2.cisco.com (8.12.6/8.12.6) with ESMTP id h4D7Q2iQ010777; Tue, 13 May 2003 00:26:02 -0700 (PDT)
Received: from [193.0.9.150] (ssh-ams-1.cisco.com [144.254.74.55]) by cisco.com (8.8.8/2.6/Cisco List Logging/8.8.8) with ESMTP id DAA15367; Tue, 13 May 2003 03:26:00 -0400 (EDT)
Mime-Version: 1.0
X-Sender: jgs@router
Message-Id: <p05210602bae64b4a3ff7@[193.0.9.150]>
In-Reply-To: <20030512142546.E5895@nexthop.com>
References: <20030512095637.B5895@nexthop.com> <200305121808.h4CI8fH9022253@rtp-core-1.cisco.com> <20030512142546.E5895@nexthop.com>
Date: Tue, 13 May 2003 09:15:20 +0200
To: Jeffrey Haas <jhaas@nexthop.com>
From: "John G. Scudder" <jgs@cisco.com>
Subject: Re: On BGP and VPLS
Cc: Eric Rosen <erosen@cisco.com>, idr@merit.edu
Content-Type: text/plain; charset="us-ascii" ; format="flowed"
Sender: owner-idr@merit.edu
Precedence: bulk

At 2:25 PM -0400 5/12/03, Jeffrey Haas wrote:
>If you flood more than one type
>of reachability, you get to make hard choices such as which reachability
>do you want to converge faster and if you start exceeding the resource
>bounds of your router, which information do you toss?

OK, but this is orthogonal to whether you carry said information in 
BGP or some other protocol, since they're all sharing the same 
resources.

By the way, I think this use of the term "flooding" is rather 
unfortunate since it already has a well understood meaning in the 
routing protocol context, and it suggests link-state.

--John


Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id OAA20556 for <idr-archive@nic.merit.edu>; Mon, 12 May 2003 14:27:26 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id 0A3E59125E; Mon, 12 May 2003 14:27:04 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id C5D0E9125F; Mon, 12 May 2003 14:27:03 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id B1D059125E for <idr@trapdoor.merit.edu>; Mon, 12 May 2003 14:27:02 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id 8CF735DE20; Mon, 12 May 2003 14:27:02 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from presque.nexthop.com (dns.nexthop.com [65.247.36.216]) by segue.merit.edu (Postfix) with ESMTP id 5D5425DE1D for <idr@merit.edu>; Mon, 12 May 2003 14:27:02 -0400 (EDT)
Received: (from root@localhost) by presque.nexthop.com (8.12.8/8.11.1) id h4CIPtIf039930; Mon, 12 May 2003 14:25:55 -0400 (EDT) (envelope-from jhaas@jhaas.nexthop.com)
Received: from jhaas.nexthop.com (jhaas.nexthop.com [65.247.36.31]) by presque.nexthop.com (8.12.8/8.12.8) with ESMTP id h4CIPpWB039922; Mon, 12 May 2003 14:25:51 -0400 (EDT) (envelope-from jhaas@jhaas.nexthop.com)
Received: (from jhaas@localhost) by jhaas.nexthop.com (8.11.3nb1/8.11.3) id h4CIPk209204; Mon, 12 May 2003 14:25:46 -0400 (EDT)
Date: Mon, 12 May 2003 14:25:46 -0400
From: Jeffrey Haas <jhaas@nexthop.com>
To: Eric Rosen <erosen@cisco.com>
Cc: idr@merit.edu
Subject: Re: On BGP and VPLS
Message-ID: <20030512142546.E5895@nexthop.com>
References: <20030512095637.B5895@nexthop.com> <200305121808.h4CI8fH9022253@rtp-core-1.cisco.com>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <200305121808.h4CI8fH9022253@rtp-core-1.cisco.com>; from erosen@cisco.com on Mon, May 12, 2003 at 02:08:41PM -0400
X-Virus-Scanned: by AMaViS perl-11
Sender: owner-idr@merit.edu
Precedence: bulk

Aside from noting that sctp would be a fine way to do the multiple
streams, I think all of my issues have relatively little to do with
technical issues and are really implementation/operational ones.

If it's reachability you're flooding, and the information changes
on a hop-by-hop basis, BGP is fine.  If you flood more than one type
of reachability, you get to make hard choices such as which reachability
do you want to converge faster and if you start exceeding the resource
bounds of your router, which information do you toss?

The more that you cram into one package, the more rope you give people.
Feedback from several network operators often makes me think we've already
given them too many chain-lengths of rope. :-)

On Mon, May 12, 2003 at 02:08:41PM -0400, Eric Rosen wrote:
> I think a vendor would be unlikely  to "start from the spec".  A more likely
> implementation strategy  would be to allow  BGP connections on  both the old
> port and the  new port.  Capability advertisement or  ORF or something would
> be  used  to  choose  which  kind  of  info  gets  forwarded  on  which  BGP
> connections.  
> 
> Of course,  one day  someone would notice  that this might  require multiple
> parallel TCP connections  where a single one might do just  as well.  So the
> suggestion would probably be made that,  as an optimization, one could use a
> single port,  but encode the  different kinds of  data as NLRI  of different
> address families.
> 
> So saying "use a different port" doesn't really make the problem go away. 

-- 
Jeff Haas 
NextHop Technologies


Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id OAA20454 for <idr-archive@nic.merit.edu>; Mon, 12 May 2003 14:09:08 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id D70AC9125D; Mon, 12 May 2003 14:08:45 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id A48909125E; Mon, 12 May 2003 14:08:45 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id A2D2A9125D for <idr@trapdoor.merit.edu>; Mon, 12 May 2003 14:08:44 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id 7C7365DE0E; Mon, 12 May 2003 14:08:44 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from rtp-core-1.cisco.com (rtp-core-1.cisco.com [64.102.124.12]) by segue.merit.edu (Postfix) with ESMTP id 17BFB5DE0B for <idr@merit.edu>; Mon, 12 May 2003 14:08:44 -0400 (EDT)
Received: from cisco.com (erosen-u10.cisco.com [161.44.70.36]) by rtp-core-1.cisco.com (8.12.6/8.12.6) with ESMTP id h4CI8fH9022253; Mon, 12 May 2003 14:08:41 -0400 (EDT)
Message-Id: <200305121808.h4CI8fH9022253@rtp-core-1.cisco.com>
To: Jeffrey Haas <jhaas@nexthop.com>
Cc: Pedro Roque Marques <roque@juniper.net>, idr@merit.edu
Subject: Re: On BGP and VPLS 
In-reply-to: Your message of Mon, 12 May 2003 09:56:37 -0400. <20030512095637.B5895@nexthop.com> 
Reply-To: erosen@cisco.com
User-Agent: EMH/1.14.1 SEMI/1.14.3 (Ushinoya) FLIM/1.14.3 (=?ISO-8859-4?Q?Unebigory=F2mae?=) APEL/10.3 Emacs/21.3 (sparc-sun-solaris2.8) MULE/5.0 (SAKAKI)
MIME-Version: 1.0 (generated by SEMI 1.14.3 - "Ushinoya")
Content-Type: text/plain; charset=US-ASCII
Date: Mon, 12 May 2003 14:08:41 -0400
From: Eric Rosen <erosen@cisco.com>
Sender: owner-idr@merit.edu
Precedence: bulk

Jeff> But  it'd be  nice for  everyone that  wants to  "co-opt  just another
Jeff> little piece of BGP because it works" that maybe you should start from
Jeff> the spec, get your own TCP port and flood the stuff in parallel.

I think a vendor would be unlikely  to "start from the spec".  A more likely
implementation strategy  would be to allow  BGP connections on  both the old
port and the  new port.  Capability advertisement or  ORF or something would
be  used  to  choose  which  kind  of  info  gets  forwarded  on  which  BGP
connections.  

Of course,  one day  someone would notice  that this might  require multiple
parallel TCP connections  where a single one might do just  as well.  So the
suggestion would probably be made that,  as an optimization, one could use a
single port,  but encode the  different kinds of  data as NLRI  of different
address families.

So saying "use a different port" doesn't really make the problem go away. 


Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id JAA18153 for <idr-archive@nic.merit.edu>; Mon, 12 May 2003 09:57:10 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id 7BCA591236; Mon, 12 May 2003 09:56:50 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id 4351D9124A; Mon, 12 May 2003 09:56:50 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id 3F61291236 for <idr@trapdoor.merit.edu>; Mon, 12 May 2003 09:56:49 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id 27BC75E552; Mon, 12 May 2003 09:56:49 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from presque.nexthop.com (dns.nexthop.com [65.247.36.216]) by segue.merit.edu (Postfix) with ESMTP id E8CB05E56C for <idr@merit.edu>; Mon, 12 May 2003 09:56:48 -0400 (EDT)
Received: (from root@localhost) by presque.nexthop.com (8.12.8/8.11.1) id h4CDulVi032409; Mon, 12 May 2003 09:56:47 -0400 (EDT) (envelope-from jhaas@jhaas.nexthop.com)
Received: from jhaas.nexthop.com (jhaas.nexthop.com [65.247.36.31]) by presque.nexthop.com (8.12.8/8.12.8) with ESMTP id h4CDugWB032394; Mon, 12 May 2003 09:56:42 -0400 (EDT) (envelope-from jhaas@jhaas.nexthop.com)
Received: (from jhaas@localhost) by jhaas.nexthop.com (8.11.3nb1/8.11.3) id h4CDubW06114; Mon, 12 May 2003 09:56:37 -0400 (EDT)
Date: Mon, 12 May 2003 09:56:37 -0400
From: Jeffrey Haas <jhaas@nexthop.com>
To: Pedro Roque Marques <roque@juniper.net>
Cc: idr@merit.edu
Subject: Re: On BGP and VPLS
Message-ID: <20030512095637.B5895@nexthop.com>
References: <200305072030.h47KUOC64557@roque-bsd.juniper.net> <200305072106.RAA19411@workhorse.fictitious.org> <200305072115.h47LFdh64632@roque-bsd.juniper.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <200305072115.h47LFdh64632@roque-bsd.juniper.net>; from roque@juniper.net on Wed, May 07, 2003 at 02:15:39PM -0700
X-Virus-Scanned: by AMaViS perl-11
Sender: owner-idr@merit.edu
Precedence: bulk

On Wed, May 07, 2003 at 02:15:39PM -0700, Pedro Roque Marques wrote:
> My argument is that the flooding algorithm in the BGP spec is
> applicable to other NLRI-types.

You could flood it using NNTP too.  It doesn't mean it's the One True
Answer.  I think that is most of Alex's point.

> And that the document can be, and has
> been sucesfully used to do so. i.e. it is still a coherent document
> when you interpret it in the context of a different NLRI-type.

There comes a time when you're just distributing too much stuff
in one protocol and taking too many chances at destabilizing what
sort of works well.

I'm not saying that VPLS is like this, having never read the specs.
But it'd be nice if everyone who wants to "co-opt just another little
piece of BGP because it works" would instead start from the spec,
get their own TCP port and flood the stuff in parallel.

Same kind of basket, just a different basket.  Hopefully fewer broken
eggs.

Now I know what the DNS folk feel like.

>   Pedro.

-- 
Jeff Haas 
NextHop Technologies


Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id KAA02786 for <idr-archive@nic.merit.edu>; Fri, 9 May 2003 10:14:26 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id ADBD99121A; Fri,  9 May 2003 10:14:02 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id 715279121B; Fri,  9 May 2003 10:14:02 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id 966A39121A for <idr@trapdoor.merit.edu>; Fri,  9 May 2003 10:14:00 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id 7B3E55E0E3; Fri,  9 May 2003 10:14:00 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from rtp-core-2.cisco.com (rtp-core-2.cisco.com [64.102.124.13]) by segue.merit.edu (Postfix) with ESMTP id 9D98F5E0DE for <idr@merit.edu>; Fri,  9 May 2003 10:13:55 -0400 (EDT)
Received: from cisco.com (uzura.cisco.com [64.102.17.77]) by rtp-core-2.cisco.com (8.12.6/8.12.6) with ESMTP id h49EDNNE019286; Fri, 9 May 2003 10:13:23 -0400 (EDT)
Received: from russpc (rtp-vpn1-91.cisco.com [10.82.224.91]) by cisco.com (8.8.8/2.6/Cisco List Logging/8.8.8) with ESMTP id KAA11527; Fri, 9 May 2003 10:13:22 -0400 (EDT)
Date: Fri, 9 May 2003 10:13:14 -0400 (Eastern Daylight Time)
From: Russ White <ruwhite@cisco.com>
Reply-To: Russ White <riw@cisco.com>
To: idr@merit.edu
Cc: rtg-dir@ietf.org
Subject: Comments on BGP Draft 20.....
Message-ID: <Pine.WNT.4.53.0305090945390.2372@russpc>
X-X-Sender: ruwhite@uzura.cisco.com
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-idr@merit.edu
Precedence: bulk

Some of these are going to echo Alex's comments, but that's okay, I think.
Mostly just nits....

:-)

Russ

__________________________________
riw@cisco.com CCIE <>< Grace Alone

-----

Abstract:

This information is sufficient to construct a graph of the AS connectivity
from which routing loops may be pruned and some policy decisions at the AS
level may be enforced.

UPDATE Message Format:

The information in the UPDATE message can be used to construct a graph
describing the relationships of the various Autonomous Systems.

In both cases this is true, I suppose, but in neither case does this really
describe what the AS Path is used for, right? I would think we'd want to
describe it less in terms of a "graph of the connectivity in the
internetwork," and more in terms of "a graph of the path through Autonomous
Systems used to reach the destination advertised." It could be confusing,
since there isn't anyplace where we discuss building a graph of
interconnectivity between the Autonomous Systems....

-----

Forwarding Paradigm:

This document uses the term "Autonomous System" (AS)  throughout....

This entire paragraph is a repeat--I'd leave it just in the definitions.

-----

Forwarding Paradigms:

The initial data flow....

This paragraph has two different thoughts in it, one about incremental
updates, and the other about keeping data that you've received. It seems
like just putting a return after "as the routing tables change." would
fix it.

-----

Forwarding Paradigms:

The paragraph starting "KEEPALIVE messages" should, I think, be moved up
above the section on route exchange. I don't know why, it just seems less
like it's jumping all over the place that way.

-----

3.1 Routes: Advertisement and Storage

It almost seems like the section about The initial data flow should maybe
be put entirely under this section someplace (?).

The first paragraph in this section is really a definition of a route vs a
prefix, and should probably be in the definitions.

The paragraph "Changing attribute of a route...." needs a "the," or
attribute needs an "s."

-----

3.2 Routing Information Bases

b) Loc-RIB....

I think it might be useful to state the contents of the Loc-RIB are
actually installed in the local routing table, and thus used for forwarding
packets on this router. I don't see anyplace this connection is made
explicit, it seems more like it's implicit throughout the doc.

-----

Page 18, a) LOCAL_PREF

"....to inform other peers...." should be "....to inform its other
peers...."

-----

Network Layer Reachability Information

"This variable length field contains a list of IP address prefixes."

I think we can kill "address" here.

a) Length

"The Length field indicates...." The sentence can start with
"Indicates..."

b) Prefix

"The Prefix field indicates...." The sentence can start with
"Indicates...."

-----

Network Layer Reachability Information

"An UPDATE message can list multiple routes to be withdrawn...."

Actually, we don't withdraw routes, we withdraw prefixes, right? The next
paragraph shows this confusion, by talking about routes without attributes,
but routes are prefixes combined with attributes, so.... They aren't
routes, they're prefixes. You remove routes based on withdrawn prefixes, I
think.
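
To make the route/prefix distinction concrete, here is a minimal sketch,
not taken from the draft. The class name AdjRibIn, the update() method and
the dictionary layout are assumptions made for illustration only. It
treats a route as a prefix combined with path attributes, and treats the
withdrawn list as bare prefixes that cause the matching routes to be
removed.

```python
from dataclasses import dataclass, field


@dataclass
class AdjRibIn:
    """Routes learned from one peer, keyed by prefix (hypothetical model)."""
    routes: dict = field(default_factory=dict)  # prefix -> path attributes

    def update(self, withdrawn_prefixes, attributes, nlri_prefixes):
        # Withdrawn entries are bare prefixes: no attributes accompany them.
        # Removing a prefix removes the route previously stored under it.
        for prefix in withdrawn_prefixes:
            self.routes.pop(prefix, None)
        # Each NLRI prefix combined with the shared attributes forms a route.
        for prefix in nlri_prefixes:
            self.routes[prefix] = attributes


rib = AdjRibIn()
rib.update([], {"origin": "IGP"}, ["10.0.0.0/8", "192.0.2.0/24"])
rib.update(["10.0.0.0/8"], None, [])
```

After the second call only the route for 192.0.2.0/24 remains: the
withdrawal named a prefix, and the route stored under it went away.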

------

5. Path Attributes

"Well-known attributes MUST be recognized by all BGP implementations."

This sentence, as strange as it may sound, implies it's the attribute's
fault if the BGP implementation doesn't recognize it, that it's up to the
attribute definers to, in some way, make certain that BGP implementations
will recognize it. I think it should probably be worded the other way
'round:

"BGP implementations MUST recognize all well-known attributes."

-----

5. Path Attributes

"All well-known attributes MUST be passed along (after proper updating, if
necessary) to other BGP peers."

This just seems a little rough. Maybe this:

"Once a BGP peer has updated any well-known attributes, it MUST pass these
attributes in any updates it transmits to its peers."

-----

5.1.1 ORIGIN

"Its value SHOULD NOT be changed by any other speaker."

I really think this should be "MUST NOT." I can't think of any reason it
wouldn't be, except in the case of aggregation, and that case could be
mentioned here as the only known exception (?).




Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id RAA11337 for <idr-archive@nic.merit.edu>; Wed, 7 May 2003 17:47:15 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id 3D584912A2; Wed,  7 May 2003 17:46:11 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id 0495D912A5; Wed,  7 May 2003 17:46:10 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id 7581F912A2 for <idr@trapdoor.merit.edu>; Wed,  7 May 2003 17:46:05 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id 5D2325DE3E; Wed,  7 May 2003 17:46:05 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from workhorse.fictitious.org (workhorse.fictitious.org [209.150.1.230]) by segue.merit.edu (Postfix) with ESMTP id 9A6F45DE34 for <idr@merit.edu>; Wed,  7 May 2003 17:46:04 -0400 (EDT)
Received: from workhorse.fictitious.org (localhost.fictitious.org [127.0.0.1]) by workhorse.fictitious.org (8.9.3/8.9.3) with ESMTP id RAA19938; Wed, 7 May 2003 17:45:48 -0400 (EDT) (envelope-from curtis@workhorse.fictitious.org)
Message-Id: <200305072145.RAA19938@workhorse.fictitious.org>
To: Pedro Roque Marques <roque@juniper.net>
Cc: curtis@fictitious.org, ppvpn@nortelnetworks.com, idr@merit.edu
Reply-To: curtis@fictitious.org
Subject: Re: On BGP and VPLS 
In-reply-to: Your message of "Wed, 07 May 2003 14:15:39 PDT." <200305072115.h47LFdh64632@roque-bsd.juniper.net> 
Date: Wed, 07 May 2003 17:45:48 -0400
From: Curtis Villamizar <curtis@fictitious.org>
Sender: owner-idr@merit.edu
Precedence: bulk

In message <200305072115.h47LFdh64632@roque-bsd.juniper.net>,
Pedro Roque Marques writes:
> Curtis Villamizar writes:
> 
> > Pedro,
> 
> > This is the BGP4 base spec.  BGP advertisements are interpreted
> > as IP prefixes, aka NLRI, for which routes are advertised, and
> > not as general keys mapping to general records.
> 
> > If some extension of BGP4 such as VPN or PW makes some other use of
> > BGP flooding then that's fine but need not be reflected in the base
> > document.
> 
> > The changes that you are only vaguely specifying (s/route/record/g
> > s/IP prefix/key/g) doesn't at all pertain to the use of BGP as
> > defined in the base spec, hasn't been interpreted as being in
> > conflict with any existing document, and I don't think this is a
> > productive discussion during last call of a very key document.
> 
> Curtis,
> In no way was i suggesting that we change the base spec.
> 
> My argument is that the flooding algorithm in the BGP spec is
> applicable to other NLRI-types. And that the document can be, and has
> been successfully used to do so. i.e. it is still a coherent document
> when you interpret it in the context of a different NLRI-type.
> 
> regards,
>   Pedro.


So you are not suggesting a change at all to BGP4.  If so you don't
need to involve the IDR WG in a semantic discussion.

If the issue is whether to use BGP4 for distribution of VPLS
information and the objection were along the lines of scaling, or some
other technical matter then that was not at all clear.  

Unless I missed something, IDR was added to the Cc mid-discussion.  If
so, maybe you should tell us what draft you are discussing and be
clear about what the issue is.

Curtis



Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id RAA11007 for <idr-archive@nic.merit.edu>; Wed, 7 May 2003 17:23:21 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id 51D4E912A1; Wed,  7 May 2003 17:20:54 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id 1B0B1912A2; Wed,  7 May 2003 17:20:54 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id CB79F912A1 for <idr@trapdoor.merit.edu>; Wed,  7 May 2003 17:20:47 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id A86C45DE4B; Wed,  7 May 2003 17:20:47 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from workhorse.fictitious.org (workhorse.fictitious.org [209.150.1.230]) by segue.merit.edu (Postfix) with ESMTP id E86355DE49 for <idr@merit.edu>; Wed,  7 May 2003 17:20:46 -0400 (EDT)
Received: from workhorse.fictitious.org (localhost.fictitious.org [127.0.0.1]) by workhorse.fictitious.org (8.9.3/8.9.3) with ESMTP id RAA19579; Wed, 7 May 2003 17:20:25 -0400 (EDT) (envelope-from curtis@workhorse.fictitious.org)
Message-Id: <200305072120.RAA19579@workhorse.fictitious.org>
To: Pedro Roque Marques <roque@juniper.net>
Cc: Alex Zinin <zinin@psg.com>, ppvpn@nortelnetworks.com, idr@merit.edu
Reply-To: curtis@fictitious.org
Subject: Re: On BGP and VPLS 
In-reply-to: Your message of "Wed, 07 May 2003 13:58:58 PDT." <200305072058.h47Kww664593@roque-bsd.juniper.net> 
Date: Wed, 07 May 2003 17:20:25 -0400
From: Curtis Villamizar <curtis@fictitious.org>
Sender: owner-idr@merit.edu
Precedence: bulk

In message <200305072058.h47Kww664593@roque-bsd.juniper.net>,
Pedro Roque Marques writes:
> Pedro Roque Marques writes:
> 
> >>> The way i see it there is a high likelihood of this turning into
> >>> a "Yes, it is" "No, it isn't" discussion. And I'd really like to
> >>> avoid that.
> 
> >> Agreed.
> 
> Following up on my own e-mail... i don't think it is in any way
> productive to continue the discussion down this path.
> 
> Let's try to turn the discussion around to the positive side:
> 
> Ignore VPLS for now.
> 
> Let's define a problem:
> 
> 1. A database consisting of entries (key, attr) needs to be propagated
> across routers of the same domain and across different
> administrative domains.
> 
> 2. A given key may be originated by more than one of the
> participating systems.
> 
> 3. A key advertised via a given member of a domain depends on
> reachability to that advertiser.
> 
> Task at hand is to find a solution to the problem above.
> 
>   Pedro.


Pedro,

You are welcome to consider BGP for this key distribution even if the
BGP spec does not match the terminology you are looking for.  The
terminology would not be the deciding factor.  If there are technical
problems with whatever you propose, that is a separate matter.

Now please drop IDR from the Cc and go about doing the ppvpn work,
whether it is VPLS or selecting a mechanism for your hypothetical
key/attr distribution.

Curtis


Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id RAA10933 for <idr-archive@nic.merit.edu>; Wed, 7 May 2003 17:16:27 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id A44E291299; Wed,  7 May 2003 17:15:46 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id 6DA129129D; Wed,  7 May 2003 17:15:46 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id 3B4C291299 for <idr@trapdoor.merit.edu>; Wed,  7 May 2003 17:15:45 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id 259785DE24; Wed,  7 May 2003 17:15:45 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from merlot.juniper.net (natint.juniper.net [207.17.136.129]) by segue.merit.edu (Postfix) with ESMTP id 9888F5DE12 for <idr@merit.edu>; Wed,  7 May 2003 17:15:44 -0400 (EDT)
Received: from roque-bsd.juniper.net (roque-bsd.juniper.net [172.17.12.183]) by merlot.juniper.net (8.11.3/8.11.3) with ESMTP id h47LFdu31459; Wed, 7 May 2003 14:15:39 -0700 (PDT) (envelope-from roque@juniper.net)
Received: (from roque@localhost) by roque-bsd.juniper.net (8.11.6/8.9.3) id h47LFdh64632; Wed, 7 May 2003 14:15:39 -0700 (PDT) (envelope-from roque)
Date: Wed, 7 May 2003 14:15:39 -0700 (PDT)
Message-Id: <200305072115.h47LFdh64632@roque-bsd.juniper.net>
From: Pedro Roque Marques <roque@juniper.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
To: curtis@fictitious.org
Cc: ppvpn@nortelnetworks.com, idr@merit.edu
Subject: Re: On BGP and VPLS 
In-Reply-To: <200305072106.RAA19411@workhorse.fictitious.org>
References: <200305072030.h47KUOC64557@roque-bsd.juniper.net> <200305072106.RAA19411@workhorse.fictitious.org>
X-Mailer: VM 6.34 under 19.16 "Lille" XEmacs Lucid
Sender: owner-idr@merit.edu
Precedence: bulk

Curtis Villamizar writes:

> Pedro,

> This is the BGP4 base spec.  BGP advertisements are interpreted
> as IP prefixes, aka NLRI, for which routes are advertised, and
> not as general keys mapping to general records.

> If some extension of BGP4 such as VPN or PW makes some other use of
> BGP flooding then that's fine but need not be reflected in the base
> document.

> The changes that you are only vaguely specifying (s/route/record/g
> s/IP prefix/key/g) doesn't at all pertain to the use of BGP as
> defined in the base spec, hasn't been interpreted as being in
> conflict with any existing document, and I don't think this is a
> productive discussion during last call of a very key document.

Curtis,
In no way was i suggesting that we change the base spec.

My argument is that the flooding algorithm in the BGP spec is
applicable to other NLRI-types. And that the document can be, and has
been successfully used to do so. i.e. it is still a coherent document
when you interpret it in the context of a different NLRI-type.

regards,
  Pedro.


Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id RAA10895 for <idr-archive@nic.merit.edu>; Wed, 7 May 2003 17:06:41 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id F1A0E9129C; Wed,  7 May 2003 17:06:19 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id C2FF69129D; Wed,  7 May 2003 17:06:18 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id 86A0C9129C for <idr@trapdoor.merit.edu>; Wed,  7 May 2003 17:06:17 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id 66E025DE1F; Wed,  7 May 2003 17:06:17 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from workhorse.fictitious.org (workhorse.fictitious.org [209.150.1.230]) by segue.merit.edu (Postfix) with ESMTP id 9A6995DE18 for <idr@merit.edu>; Wed,  7 May 2003 17:06:16 -0400 (EDT)
Received: from workhorse.fictitious.org (localhost.fictitious.org [127.0.0.1]) by workhorse.fictitious.org (8.9.3/8.9.3) with ESMTP id RAA19411; Wed, 7 May 2003 17:06:04 -0400 (EDT) (envelope-from curtis@workhorse.fictitious.org)
Message-Id: <200305072106.RAA19411@workhorse.fictitious.org>
To: Pedro Roque Marques <roque@juniper.net>
Cc: Alex Zinin <zinin@psg.com>, ppvpn@nortelnetworks.com, idr@merit.edu
Reply-To: curtis@fictitious.org
Subject: Re: On BGP and VPLS 
In-reply-to: Your message of "Wed, 07 May 2003 13:30:24 PDT." <200305072030.h47KUOC64557@roque-bsd.juniper.net> 
Date: Wed, 07 May 2003 17:06:04 -0400
From: Curtis Villamizar <curtis@fictitious.org>
Sender: owner-idr@merit.edu
Precedence: bulk

Pedro,

This is the BGP4 base spec.  BGP advertisements are interpreted as
IP prefixes, aka NLRI, for which routes are advertised, and not as
general keys mapping to general records.

If some extension of BGP4 such as VPN or PW makes some other use of
BGP flooding then that's fine but need not be reflected in the base
document.

The changes that you are only vaguely specifying (s/route/record/g
s/IP prefix/key/g) doesn't at all pertain to the use of BGP as defined
in the base spec, hasn't been interpreted as being in conflict with
any existing document, and I don't think this is a productive
discussion during last call of a very key document.

Curtis


Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id QAA10813 for <idr-archive@nic.merit.edu>; Wed, 7 May 2003 16:59:24 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id EB9129129B; Wed,  7 May 2003 16:59:05 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id BB0079129C; Wed,  7 May 2003 16:59:05 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id 8AB169129B for <idr@trapdoor.merit.edu>; Wed,  7 May 2003 16:59:04 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id 784425DE3D; Wed,  7 May 2003 16:59:04 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from merlot.juniper.net (natint.juniper.net [207.17.136.129]) by segue.merit.edu (Postfix) with ESMTP id F0BF45DE23 for <idr@merit.edu>; Wed,  7 May 2003 16:59:03 -0400 (EDT)
Received: from roque-bsd.juniper.net (roque-bsd.juniper.net [172.17.12.183]) by merlot.juniper.net (8.11.3/8.11.3) with ESMTP id h47Kwxu29839; Wed, 7 May 2003 13:58:59 -0700 (PDT) (envelope-from roque@juniper.net)
Received: (from roque@localhost) by roque-bsd.juniper.net (8.11.6/8.9.3) id h47Kww664593; Wed, 7 May 2003 13:58:58 -0700 (PDT) (envelope-from roque)
Date: Wed, 7 May 2003 13:58:58 -0700 (PDT)
Message-Id: <200305072058.h47Kww664593@roque-bsd.juniper.net>
From: Pedro Roque Marques <roque@juniper.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
To: Alex Zinin <zinin@psg.com>
Cc: ppvpn@nortelnetworks.com, idr@merit.edu
Subject: Re: On BGP and VPLS
In-Reply-To: <200305072030.h47KUOC64557@roque-bsd.juniper.net>
References: <51133594448.20030502191439@psg.com> <200305030805.h4385Kd51107@roque-bsd.juniper.net> <6857870813.20030507112815@psg.com> <200305072030.h47KUOC64557@roque-bsd.juniper.net>
X-Mailer: VM 6.34 under 19.16 "Lille" XEmacs Lucid
Sender: owner-idr@merit.edu
Precedence: bulk

Pedro Roque Marques writes:

>>> The way i see it there is a high likelihood of this turning into
>>> a "Yes, it is" "No, it isn't" discussion. And I'd really like to
>>> avoid that.

>> Agreed.

Following up on my own e-mail... i don't think it is in any way
productive to continue the discussion down this path.

Let's try to turn the discussion around to the positive side:

Ignore VPLS for now.

Let's define a problem:

1. A database consisting of entries (key, attr) needs to be propagated
across routers of the same domain and across different
administrative domains.

2. A given key may be originated by more than one of the
participating systems.

3. A key advertised via a given member of a domain depends on
reachability to that advertiser.

Task at hand is to find a solution to the problem above.

  Pedro.


Received: from trapdoor.merit.edu (postfix@trapdoor.merit.edu [198.108.1.26]) by nic.merit.edu (8.9.3/8.9.1) with ESMTP id QAA10693 for <idr-archive@nic.merit.edu>; Wed, 7 May 2003 16:32:39 -0400 (EDT)
Received: by trapdoor.merit.edu (Postfix) id 9D3C89129A; Wed,  7 May 2003 16:31:38 -0400 (EDT)
Delivered-To: idr-outgoing@trapdoor.merit.edu
Received: by trapdoor.merit.edu (Postfix, from userid 56) id 689D09129B; Wed,  7 May 2003 16:31:38 -0400 (EDT)
Delivered-To: idr@trapdoor.merit.edu
Received: from segue.merit.edu (segue.merit.edu [198.108.1.41]) by trapdoor.merit.edu (Postfix) with ESMTP id C0DC39129A for <idr@trapdoor.merit.edu>; Wed,  7 May 2003 16:31:36 -0400 (EDT)
Received: by segue.merit.edu (Postfix) id ABF695DDAA; Wed,  7 May 2003 16:31:36 -0400 (EDT)
Delivered-To: idr@merit.edu
Received: from merlot.juniper.net (natint.juniper.net [207.17.136.129]) by segue.merit.edu (Postfix) with ESMTP id 3F7EA5DE2A for <idr@merit.edu>; Wed,  7 May 2003 16:31:36 -0400 (EDT)
Received: from roque-bsd.juniper.net (roque-bsd.juniper.net [172.17.12.183]) by merlot.juniper.net (8.11.3/8.11.3) with ESMTP id h47KUOu27778; Wed, 7 May 2003 13:30:24 -0700 (PDT) (envelope-from roque@juniper.net)
Received: (from roque@localhost) by roque-bsd.juniper.net (8.11.6/8.9.3) id h47KUOC64557; Wed, 7 May 2003 13:30:24 -0700 (PDT) (envelope-from roque)
Date: Wed, 7 May 2003 13:30:24 -0700 (PDT)
Message-Id: <200305072030.h47KUOC64557@roque-bsd.juniper.net>
From: Pedro Roque Marques <roque@juniper.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
To: Alex Zinin <zinin@psg.com>
Cc: ppvpn@nortelnetworks.com, idr@merit.edu
Subject: Re: On BGP and VPLS
In-Reply-To: <6857870813.20030507112815@psg.com>
References: <51133594448.20030502191439@psg.com> <200305030805.h4385Kd51107@roque-bsd.juniper.net> <6857870813.20030507112815@psg.com>
X-Mailer: VM 6.34 under 19.16 "Lille" XEmacs Lucid
Sender: owner-idr@merit.edu
Precedence: bulk

Alex Zinin writes:

> Pedro, Sorry for the delay. I found your message in my IDR folder
> just recently. I'll limit my involvement to this message, I think I
> will have said enough for my opinion to be heard :)

Alex,

>> The way i see it there is a high likelihood of this turning into
>> a "Yes, it is" "No, it isn't" discussion. And I'd really like to
>> avoid that.

> Agreed.

It seems to me that we immediately got into this deadlock point. Allow
me to make a further attempt to clarify my point of view.

>> This point seems to be predicated on the statement that "BGP uses
>> the NLRI field to carry IP reachability"...

> Yes, plus that "BGP is an IP routing protocol."

That is one possible application of BGP. It does not invalidate the
possibility of other applications.

Let me rephrase my point:

It is possible to extract from the BGP specification a common set of
functionality that is independent of 'IP routing'. Such common set of
functionality consists of the ability to flood databases of
information across multiple domains, in a distributed fashion.

Since this problem reoccurs often in networking, several other
applications have made use of this commonality to avoid reinventing
the wheel.

The 'common functionality' in question is not so much what is
documented in the BGP specification but the algorithms required to
implement it.

> This is where we disagree in the first place.

> There's a fine line between a database distribution flooding
> algorithm and a path-vector routing algorithm. BGP is clearly not
> the former.

"Yes, it is" :-)
Give me an argument and i'll try a more productive response.

>> As an exercise, if we take the existing spec and do:
>> s/route/record/g s/IP prefix/key/g

>> Do we still have a document that makes sense.. ?

> I'm not sure :)

>> Except for the vague bits about aggregation, about which BGP itself
>> does little, i would contend that the result would be pretty
>> much the same.

> You are forgetting the parts about Loc-RIB, routing and forwarding
> tables, next-hop, etc. In any case, such a beast would cease to be
> an IP routing protocol, but would still perform best path selection
> to an opaque key in the Internet. If you need a database
> distribution mechanism as you described above, you don't need the
> path vector behavior, nor per-peer state for each key, nor
> next-hops. I.e., you don't need what BGP does.

You repeat this argument throughout your e-mail. And i'm completely
missing your point:

The path-vector algorithm is essential to the flooding of the BGP
database information. Without it flooding would eternally loop.

path-vector as in:
 o as-path and cluster-list loop detection
 o iBGP doesn't advertise iBGP

This is what makes BGP flooding work. None of this information is used
for Loc-rib purposes.
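
A minimal sketch of the two rules listed above, under stated assumptions:
the names Record, accept, may_advertise and prepend_as are hypothetical,
and the route-reflector case is reduced to a single flag rather than the
full client/non-client rules. The point it illustrates is that neither
check looks at what the key means.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Record:
    key: str                        # opaque key (an IP prefix in plain BGP)
    as_path: tuple = ()             # ASes already crossed, most recent first
    cluster_list: tuple = ()        # reflector cluster IDs already crossed
    learned_via_ibgp: bool = False  # True if received from an internal peer


def accept(record, local_as, local_cluster_id=None):
    """Loop detection: reject a record that already crossed this AS or cluster."""
    if local_as in record.as_path:
        return False
    if local_cluster_id is not None and local_cluster_id in record.cluster_list:
        return False
    return True


def may_advertise(record, peer_is_ibgp, is_route_reflector=False):
    """iBGP-learned records are not passed to iBGP peers unless reflecting.

    Simplification: a real reflector also distinguishes clients from
    non-clients; this sketch does not.
    """
    if record.learned_via_ibgp and peer_is_ibgp and not is_route_reflector:
        return False
    return True


def prepend_as(record, local_as):
    """Copy sent to an eBGP peer: local AS prepended, cluster list dropped
    (assumed here to be meaningful only inside the AS)."""
    return Record(record.key, (local_as,) + record.as_path, (), False)
```

With these checks in place a record can be flooded without looping,
whatever the key happens to represent.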

>>>  2. Distribution of information

[...]

>> P routers do not carry 2547 routing information.

> Unless the SPs want to use the existing RR infrastructure.

All the SPs i've worked with explicitly do not want to do this.

>> Not really... i can advertise the same key from multiple sources in
>> L2VPNs also. All policy mechanisms do work... igp distance, etc. It
>> is just the semantics once the path is selected that are different.
>> As an example think working and protect PE for a given emulated
>> circuit (or lan).

> I think you might have missed my point here. Though a BGP speaker
> will receive the same key from multiple sources, and will select the
> best path to it, in the VPLS case, it is not interested in the
> _best_ path, it is interested in only receiving the key, regardless
> of where it comes from.

That is factually incorrect. I gave you an example in the original
e-mail to illustrate the point. The same 'key' may be advertised from
more than one source. BGP needs to select the best-path in any such
occurrence.


> So RR and eBGP are the examples where a given BGP speaker would
> receive extra copies of information. As I replied to Mark:

>     Well, in the IP/BGP case, it is not necessarily the same info,
> as path attributes are likely to be different and the BGP speaker is
> interested in selecting the best among them while preserving others
> as the back-up. In the VPLS case, we don't need the best, just one
> copy is sufficient.

That is incorrect.

PE1 advertises 'key' preference 100, label 10000 - this is working
circuit.
PE2 advertises 'key' preference 200, label 10001 - this is protect
circuit.

It works just like IP routing. All the preference mechanisms that you
use in IP routing can be used here. IGP failure to PE1 for instance,
causes a remote system to automatically reroute.
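
The working/protect example can be sketched as follows. This is an
illustration only: Path, best_path and the reachable set are hypothetical
names, and the sketch assumes the lower preference value wins, since PE1
(preference 100) is labelled the working circuit. With BGP LOCAL_PREF the
comparison would go the other way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    key: str
    advertiser: str
    preference: int
    label: int


def best_path(paths, reachable):
    """Pick the preferred path among those whose advertiser is reachable.

    Assumption of this sketch: the lowest preference value is preferred.
    """
    usable = [p for p in paths if p.advertiser in reachable]
    if not usable:
        return None
    return min(usable, key=lambda p: p.preference)


paths = [
    Path("key", "PE1", 100, 10000),  # working circuit
    Path("key", "PE2", 200, 10001),  # protect circuit
]

working = best_path(paths, {"PE1", "PE2"})  # both PEs reachable
protect = best_path(paths, {"PE2"})         # IGP reachability to PE1 lost
```

Here working carries label 10000 and protect carries label 10001: losing
reachability to PE1 is enough to move the remote system to the protect
circuit, with no change to the advertised entries themselves.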


> This does not change the fact that information that BGP as an IP
> routing protocol distributes is aggregateable.

That doesn't mean that BGP does aggregate anything either. BGP doesn't
specify any algorithm for aggregation itself. What BGP does address is
the interaction between possible aggregation and its loop detection
mechanism.


> Also note that route
> aggregation rules are part of the routing protocol specification and
> definitely depend on the protocol behavior (true for BGP, OSPF, and
> ISIS), so it is not a completely distinct notion, though some
> implementations decouple the two.

BGP does not specify any rules for what routes should be aggregated or
into what they should be aggregated. It cannot since these are
operational decisions.

The statement above is false as far as i can understand it.

>>>  5. Coupling of VPLS and BGP SW

>>>     a) Lesser BGP code stability--bugs in the VPLS part of the
>>> code

>> You have no basis to conclude that.

> I do :)

Please present a justification then.
We are back to "no, it isn't" / "yes, it is" mode.


> The fact is that pieces of code in routing protocol implementations
> are not only statically related via the function call tree, but also
> dynamically and indirectly... but I will stop right here, because
> we'll inevitably get into implementation specifics...

This is what i call FUD...
"Uncertainty" not being backed by any argument.

The way i see it this is a central piece of your (IESG) original
"concern" statement.

>>>     b) Potential dynamic effects--since with a BGP-based approach,

>> I'm sorry but this is just FUD.

> I hope people don't think about potential interference between large
> distributed systems as FUD.

You are wording it just that way: "potential interference between large
distributed systems".

No facts, no specific points. Just vague allegations.

If the above is "fair game" then i want to ask the IESG to apply the
same "potential interference" criterion to, say, IPv6...

IPv6 not only involves changes to BGP to support this NLRI but changes
to all hosts and pretty much all protocols and applications. It can
"dynamically and indirectly" cause interference too.

Perhaps IPv6 is a threat to the internet stability ? Since i don't
intend to run IPv6 in the foreseeable future, the standardization of
IPv6 is going to cause 'interference' with my workstation software,
which i'd rather not deal with.

>    I tend to look at this more broadly--putting VPLS functionality
> in BGP increases the chances of interference.  Some consider this a
> strictly implementation-specific issue.  I think that whether or not
> VPLS-specific functionality is sufficiently decoupled from base BGP
> is an implementation aspect; while increased risk of interference is
> an architectural one.

I believe that this completely confuses architecture with
implementation.

This is not an architectural consideration. This is a vague aspersion
on the competency of BGP implementors.

Your argument taken to the extreme would result in forbidding all and
any standardization of new software mechanisms.

  Pedro.


Date: Wed, 7 May 2003 11:28:15 -0700
From: Alex Zinin <zinin@psg.com>
X-Mailer: The Bat! (v1.62i) Personal
Reply-To: Alex Zinin <zinin@psg.com>
X-Priority: 3 (Normal)
Message-ID: <6857870813.20030507112815@psg.com>
To: Pedro Roque Marques <roque@juniper.net>
Cc: ppvpn@nortelnetworks.com, idr@merit.edu
Subject: Re: On BGP and VPLS
In-Reply-To: <200305030805.h4385Kd51107@roque-bsd.juniper.net>
References: <51133594448.20030502191439@psg.com> <200305030805.h4385Kd51107@roque-bsd.juniper.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-idr@merit.edu
Precedence: bulk

Pedro,

Sorry for the delay. I found your message in my IDR folder only
recently. I'll limit my involvement to this message; I think I will
have said enough for my opinion to be heard :)

> This point seems to be predicated in the statement that "BGP uses the
> NLRI field to carry IP reachability"...

Yes, plus that "BGP is an IP routing protocol."

> It opens up a sort of philosophical discussion on BGP. This is of
> course a highly subjective topic which is hard to quantify or to prove
> by logical terms.

> Allow me to present my personal view.

> BGP is a particular implementation of an algorithm that performs non
> looping database flooding distribution. That algorithm consists mostly
> on the path vector (used both in ebgp and route reflection) plus route
> advertisement rules. This is the publicly specified part of the beast.

This is where we disagree in the first place.

There's a fine line between a database distribution flooding algorithm
and a path-vector routing algorithm. BGP is clearly not the former.

> However that ends up being about 10% of the database exchange
> algorithm. Each implementation uses distinct algorithms to do the real
> heavy lifting: the advertisement of database updates to its peers,
> given that each peer is allowed to flow control and that the ammount
> of information to be distributed is typically non-trivial compared to
> the resources of the system.

> None of the functions above actually do depend on the format of your
> database records. As long as there is a primary key associated with
> each record. Modern implementations, given that they are required to
> handle 3/4 different types of records w/ different keys (ipv4, ipv6,
> 2547, 2547-for-ipv6, etc) will tend to treat these keys just as
> database systems do: as a bit string without any semantics associated
> w/ it.

> Note also that the number of distinct tables exchanged in a 2547
> implementation may be in the thousands. So segregation of which record
> belongs to which table is necessarily a solved problem in practice.

> There is one part of BGP that however interacts w/ the semantics of
> the particular database being exchanged: route selection from the
> Loc-RIB.

> The Loc-RIB is by definition where BGP interacts w/ remaining users of
> the database and it includes rules that are system specific.

> As an exercise, if we take the existing spec and do:
> s/route/record/g
> s/IP prefix/key/g

> Do we still have a document that makes sense.. ?

I'm not sure :)

> Except for the vague bits about aggregation, about which BGP itself
> does little about, i would contend that the result would be pretty
> much the same.

You are forgetting the parts about Loc-RIB, routing and forwarding
tables, next-hop, etc. In any case, such a beast would cease to be an
IP routing protocol, but would still perform best path selection to an
opaque key in the Internet. If you need a database distribution
mechanism as you described above, you don't need the path vector
behavior, nor per-peer state for each key, nor next-hops. I.e., you
don't need what BGP does.

> 2547 which you cite is a particular good example, imho. A 2547 NLRI
> ends up being used to create IP reachability information, but while it
> is a safi 128 record, it is not IP reachability and it is not treated
> as such.

2547 augments _IP prefixes_ with route distinguishers only to make
sure that the prefix is unique when used as the key in the RIB. The
rest of the reachability/prefix semantics are preserved.
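
A minimal sketch of the point about route distinguishers, assuming a
type 0 RD (2-octet AS number plus 4-octet assigned number). The function
name and the key layout are illustrative only and are not the on-the-wire
NLRI encoding (which also carries a label):

```python
import ipaddress
import struct


def vpn_ipv4_key(rd_asn: int, rd_assigned: int, prefix: str) -> bytes:
    """Build a hypothetical RIB key: 8-octet RD followed by the IPv4 prefix."""
    net = ipaddress.ip_network(prefix)
    # Type 0 RD: 2-octet type (0), 2-octet AS number, 4-octet assigned number.
    rd = struct.pack("!HHI", 0, rd_asn, rd_assigned)
    # The prefix length and address are kept intact, so the usual
    # prefix semantics survive; the RD only makes the key unique.
    return rd + bytes([net.prefixlen]) + net.network_address.packed


# Two customers using the same private prefix get distinct keys.
key_a = vpn_ipv4_key(65000, 1, "10.0.0.0/8")
key_b = vpn_ipv4_key(65000, 2, "10.0.0.0/8")
```

The two keys differ only in the leading eight octets; the trailing prefix
part is identical and can still be interpreted as an IP prefix.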

>>  2. Distribution of information

> That is not the case w/ 2547. PE routers typically have interest in
> only a subset of the routing information. They tend to do inbound
> filtering in current network deployements but one can also do outbound
> filtering in the RRs via either extended-community ORF or subsequent
> improvements to ORF (draft-marques-ppvpn-rt-contrain).

Note that I purposely compared the approach in the document with
the base BGP, not 2547.

> P routers do not carry 2547 routing information.

Unless the SPs want to use the existing RR infrastructure.

> RR in VPN deployments are typically not in the topology. My
> understanding of the P-router term is that it is a transit node that
> does not have VPN information.

Agree, they don't have to. The text should have been "More than that,
route reflectors (in some cases implemented on P routers) end up..."

> Not really... i can advertise the same key from multiple sources in
> L2VPNs also. All policy mechanisms do work... igp distance, etc. It is
> just the semantics once the path is selected that are different.
> As an example think working and protect PE for a given emulated
> circuit (or lan).

I think you might have missed my point here. Though a BGP speaker will
receive the same key from multiple sources, and will select the best
path to it, in the VPLS case, it is not interested in the _best_ path,
it is interested in only receiving the key, regardless of where it
comes from.

> I don't know which model you have in mind but in a typical VPN
> deployment scenario (l3 or l2/vpls/etc) a PE has 2 peering sessions to
> a RR outside of the topology. The second copy of the information is
> there for redudancy...

> If a full mesh where used, only 1 copy would be present.

So RR and eBGP are the examples where a given BGP speaker would
receive extra copies of information. As I replied to Mark:

    Well, in the IP/BGP case, it is not necessarily the same info,
    as path attributes are likely to be different and the BGP
    speaker is interested in selecting the best among them while
    preserving others as the back-up. In the VPLS case, we don't
    need the best, just one copy is sufficient.

    This is not necessarily an issue per se, rather an interesting
    observation exposing the transport nature of this particular
    proposed application of BGP. We have similar properties (more
    than one copy received) in the flooding algorithm in IGPs, but
    we admit that flooding is a transport component very specific
    to IGPs (where every node needs all info, btw), and we don't
    keep track of where we receive PDUs from as we do in BGP.
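
To illustrate the contrast drawn above, here is a rough sketch. All
class names are made up for this example, and the tie-break in best() is
a deliberately simplified stand-in for the real decision process:

```python
from dataclasses import dataclass, field


@dataclass
class Path:
    peer: str
    local_pref: int
    as_path_len: int


@dataclass
class IpRib:
    """IP/BGP case: keep every peer's path, pick the best, keep the rest as back-up."""
    paths: dict = field(default_factory=dict)  # key -> {peer: Path}

    def receive(self, key, path):
        self.paths.setdefault(key, {})[path.peer] = path

    def best(self, key):
        candidates = self.paths.get(key, {}).values()
        # Simplified preference: higher LOCAL_PREF, then shorter AS_PATH, then peer name.
        return min(candidates,
                   key=lambda p: (-p.local_pref, p.as_path_len, p.peer),
                   default=None)


@dataclass
class OpaqueStore:
    """VPLS case as described above: one copy of the record is sufficient."""
    records: dict = field(default_factory=dict)  # key -> value

    def receive(self, key, value, peer):
        # The source peer is not tracked; a second copy adds nothing.
        self.records.setdefault(key, value)
```

The first structure needs per-peer state for each key; the second does
not, which is the "transport nature" observation in other words.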

>>  3. Aggregation of information for large-scale operation
...
> To give you an example, in JunOS aggregation is implemented as a
> separate routing protocol... if i'm not mistaken the model is lifted
> from 'gated'. Clearly the idea that aggregation may be a distinct
> component from BGP has been around for a while.

This does not change the fact that information that BGP as an IP
routing protocol distributes is aggregateable. Also note that route
aggregation rules are part of the routing protocol specification and
definitely depend on the protocol behavior (true for BGP, OSPF, and
ISIS), so it is not a completely distinct notion, though some
implementations decouple the two.

> VPLS doesn't really need aggregation although it does use an IGP :-)
> PE to PE connectivity is performed indepently from the 'forwarding
> distinguisher' advertisement (i.e. the inner label). Any or multiple
> routing and singaling protocols may be used for this
> functionality. Only the information exterior to the SP network
> (service attachements) is carried through BGP.

This part should go into the thread on "Info Summarization" where one
of the questions is how we can limit the amount of state/information
that a given participating node will have to maintain.

I'd like to again highlight the difference between the aggregateable
semantics of NLRI contents in BGP when it is used as an IP routing
protocol and the non-aggregateable semantics in the proposal, which means
that mechanisms different from those in current BGP practice
would need to be used to limit the amount of maintained info.
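
A small sketch of what "aggregateable" buys in terms of state, using
Python's standard ipaddress module. The opaque key strings are invented
for the example and do not come from the VPLS proposal:

```python
import ipaddress


def ip_state(prefixes):
    """IP prefixes: adjacent blocks collapse into fewer entries."""
    nets = [ipaddress.ip_network(p) for p in prefixes]
    return list(ipaddress.collapse_addresses(nets))


def opaque_state(keys):
    """Opaque keys: there is no covering key, so every key stays."""
    return sorted(set(keys))


summarized = ip_state(["192.0.2.0/25", "192.0.2.128/25"])
unsummarized = opaque_state(["site-1", "site-2"])
```

The two /25s can be represented by a single covering /24, while the two
opaque keys remain two entries, so some other mechanism is needed to
bound the amount of state.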

>>  The above gives me a very uncomfortable feeling that the proposal
>> is stretching BGP to perform functions it was not designed for.

> Any succesful protocol will be used for means other than it was
> designed for. That is usually a sign that the designers got something
> right.

I was being mild.

>>  4. Backwards compatibility and SW upgrade requirements

> That is not an issue as we've seen above. The deployment model is
> different from what you assume.

As before, unless the SPs want to use the existing RR infrastructure.

>>  5. Coupling of VPLS and BGP SW

>>     a) Lesser BGP code stability--bugs in the VPLS part of the code

> You have no basis to conclude that.

I do :)

> Any modern BGP implementation worth its salt consists of
> AF-independent code + AF-specific code. The fact is that you can
> implement VPLS without touching the AF-independent code.

The fact is that pieces of code in routing protocol implementations
are not only statically related via the function call tree, but also
dynamically and indirectly... but I will stop right here, because
we'll inevitably get into implementation specifics...

>>     b) Potential dynamic effects--since with a BGP-based approach,

> I'm sorry but this is just FUD.

I hope people don't think about potential interference between large
distributed systems as FUD.

> All router implementations do have some level of resource sharing
> between completly unrelated features. In some of them, all
> functionality shares all resources.

Agreed, though I was talking about tighter coupling when Inet BGP and
VPLS BGP are in the same process/thread (which is very likely the
case). As I said in my answer to Mark:

   I tend to look at this more broadly--putting VPLS
   functionality in BGP increases the chances of interference.
   Some consider this a strictly implementation-specific issue.
   I think that whether or not VPLS-specific functionality is
   sufficiently decoupled from base BGP is an implementation
   aspect; while increased risk of interference is an architectural
   one.

Again, I'll stop here too.
   
>>  My recommendation would be for the WG to consider these points.

> The way i see it there is an high likely-hood of this turning into an
> "Yes, it is" "No, it isn't" discussion. And I'd really like to avoid
> that.

Agreed.

> A question to you and to the WG(s) in general:

> - What are the main concerns that you have w/ the generic database
> exchange view of BGP (Lets call it the "Basically General Purpose"
> theory).

The fact that BGP is not a generic database exchange protocol,
and I don't think it should be positioned as such.

> - Can we have a reasonable discussion about the best engineering
> approach to provide database exchange services for
> routing-related-applications without getting into a religious argument
> about "2547 is evil" ? i.e. can we try to separate how highly each one
> of us rates the actual application from this discussion ?

I think we'll have to agree on the definition of "database exchange
services" and "routing-related-applications", but generally, yes,
sure.

> - I believe one of the preconditions for a resonable discussion is to
> realise that implementors are the most interested people in not
> introducing regressions to shipping code. They actually get to fix it
> after being screamed at for a considerable lenght of time.
> I'd really like to get past the "you can't implement a feature i don't
> want because your are going to break the code" kind of discussion.

I think there is a generally good understanding of this. I don't think
this is something that is sufficient for the IETF to base technical
conclusions on.

> - Are we going to have a similiar discussion about LDP ? LDP is not
> any less relevant for network stability nor a protocol which is any
> simpler than BGP (if anything the level of complexity is higher given
> that LDP has all the db exchange problem of BGP + a non trivial
> ammount of issues of its own).

I have absolutely no problem with this.

Thanks for your comments.

Alex



Date: Mon, 5 May 2003 16:38:15 -0700
From: Alex Zinin <zinin@psg.com>
X-Mailer: The Bat! (v1.62i) Personal
Reply-To: Alex Zinin <zinin@psg.com>
X-Priority: 3 (Normal)
Message-ID: <177177649135.20030505163815@psg.com>
To: idr@merit.edu
Cc: rtg-dir@ietf.org
Subject: AD-review comments on draft-ietf-idr-bgp4-20
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Sender: owner-idr@merit.edu
Precedence: bulk

Folks,

 Please find below my AD-review comments. Hopefully they will help
 improve the document. I tried to consult Andrew's list as much as
 possible, but do feel free to point out if something has already been
 discussed and agreed upon.
 
 Thanks go to Yakov for kicking me often enough ;)

--
Alex Zinin

Some nits:
- run it by a spelling checker, please
- disable hyphenation if possible
- include boilerplates for IPR notice, Copyright notice

General comment:

  In some places I have highlighted that required behavior is not
  described using the 2119 language, so it is not clear whether a MUST,
  SHOULD, or MAY applies. I am sure I've missed some more places
  like this, so I'd like to ask the editors to go through the doc and
  check this.

>                   A Border Gateway Protocol 4 (BGP-4)
>                       <draft-ietf-idr-bgp4-20.txt>
> 
> 
> Status of this Memo
> 
> 
...
>    The list of Internet-Draft Shadow Directories can be accessed at
>    http://www.ietf.org/shadow.html.
>
> Specification of Requirements

Nit: move Abstract here. Move requirements after the Acks.

> Abstract

Should the Abstract say that this spec covers IPv4 only?


> 3. Summary of Operation
...
>    This document uses the term `Autonomous System' (AS) throughout.  The
>    classic definition of an Autonomous System is a set of routers under
>    a single technical administration, using an interior gateway protocol
>    (IGP) and common metrics to determine how to route packets within the
>    AS, and using an inter-AS routing protocol to determine how to route
>    packets to other ASs. Since this classic definition was developed, it
>    has become common for a single AS to use several IGPs and sometimes
>    several sets of metrics within an AS. The use of the term Autonomous
>    System here stresses the fact that, even when multiple IGPs and met-
>    rics are used, the administration of an AS appears to other ASs to
>    have a single coherent interior routing plan and presents a consis-
>    tent picture of what destinations are reachable through it.

Ed: Since 'AS' has been defined before, do we need to repeat the
definition here?

...
>    peer in the same AS is referred to as an internal peer. Internal BGP
>    and external BGP are commonly abbreviated IBGP and EBGP.

Ed: These two have been defined before too

...
> Care must be taken to
>    ensure that the interior routers have all been updated with transit
>    information before the BGP speakers announce to other ASs that tran-
>    sit service is being provided.

What does the last sentence really mean from the implementation
perspective? It used to mean the BGP/IGP synchronization check. Now
that iBGP everywhere is assumed, how do we check this condition?

>    This document specifies the base behavior of the BGP protocol. This
>    behavior can and is modified by extention specifications.  When the
Ed: "extension"

>    protocol is extended the new behavior is fully documented in the
>    extention specifications.
Ed: "extension"

> 3.1 Routes: Advertisement and Storage
> 
>    For the purpose of this protocol, a route is defined as a unit of
>    information that pairs a set of destinations with the attributes of a
>    path to those destinations. The set of destinations are systems whose
>    IP addresses are contained in one IP address prefix carried in the
>    Network Layer Reachability Information (NLRI) field of an UPDATE mes-
>    sage, and the path is the information reported in the path attributes
>    field of the same UPDATE message.
Ed: Repeated definition again
...
>    If a BGP speaker chooses to advertise the route, it MAY add to or
>    modify the path attributes of the route before advertising it to a
>    peer.

The intent here is to say that it's ok to modify the attribute set of
a previously received route when it's announced further. As it reads,
though, self-originated routes are also within the context, and the
MAY sounds as if you don't have to add attributes when announcing
those.

...

>    Changing attribute of a route is accomplished by advertising a
>    replacement route. The replacement route carries new (changed)
>    attributes and has the same NLRI as the original route.

"same NLRI" implies the same prefix, but not the same NLRI field, which
can be different (containing other routes). Should the use of this term
be normalized throughout the document?

> 4.2 OPEN Message Format
> 
>    After a TCP is established, the first message sent by each side is an

"TCP connection"

> 5. Path Attributes
...
>    If a path with recognized transitive optional attribute is accepted
>    and passed along to other BGP peers and the Partial bit in the
>    Attribute Flags octet is set to 1 by some previous AS, it is not 

'MUST NOT' here?

> set
>    back to 0 by the current AS. Unrecognized non-transitive optional
>    attributes MUST be quietly ignored and not passed along to other BGP
>    peers.
...
>    The same attribute (attribute with the same type) can not appear more
>    than once within the Path Attributes field of a particular UPDATE
>    message.

What should an implementation do if this happens?
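
For reference, detecting the condition is straightforward; what to do
next is the open question. A minimal sketch, assuming the usual attribute
header layout (flags octet, type code octet, then a one- or two-octet
length depending on the Extended Length bit); the function name is
hypothetical:

```python
EXTENDED_LENGTH = 0x10  # Extended Length bit in the Attribute Flags octet


def duplicate_attribute_types(path_attrs: bytes) -> set:
    """Return the set of attribute type codes that appear more than once."""
    seen, dups = set(), set()
    i = 0
    while i < len(path_attrs):
        if i + 3 > len(path_attrs):
            raise ValueError("truncated attribute header")
        flags, type_code = path_attrs[i], path_attrs[i + 1]
        if flags & EXTENDED_LENGTH:
            if i + 4 > len(path_attrs):
                raise ValueError("truncated attribute header")
            length = int.from_bytes(path_attrs[i + 2:i + 4], "big")
            header = 4
        else:
            length = path_attrs[i + 2]
            header = 3
        if i + header + length > len(path_attrs):
            raise ValueError("truncated attribute value")
        if type_code in seen:
            dups.add(type_code)
        seen.add(type_code)
        i += header + length
    return dups
```

A non-empty result only flags the problem; whether to send a
NOTIFICATION, ignore the later copy, or do something else is exactly
what the text should state.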

>    The mandatory category refers to an attribute which MUST be present
>    in both IBGP and EBGP exchanges if NLRI are contained in the UPDATE

Ed: "if the NLRI field is contained" instead?

> 5.1.2 AS_PATH
...
>       b) When a given BGP speaker advertises the route to an external
>       peer, then the advertising speaker updates the AS_PATH attribute
>       as follows:
> 
>          1) if the first path segment of the AS_PATH is of type
>          AS_SEQUENCE, the local system prepends its own AS number as the
>          last element of the sequence (put it in the leftmost position).

'Leftmost position'... isn't this still open for interpretation? How
about wording this relative to the position of the octets in the
protocol message?

>          If the act of prepending will cause an overflow in the AS_PATH
>          segment, i.e. more than 255 ASs, it is legal

What's the recommended behavior here?

>          to prepend a new
>          segment of type AS_SEQUENCE and prepend its own AS number to
>          this new segment.
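
One way to word this relative to the octets: the local AS number goes
into the two octets immediately following the segment length octet of
the first segment. A rough sketch, assuming 2-octet AS numbers and the
segment layout of type octet, length octet (number of ASs), then values.
The handling of an empty AS_PATH or a leading AS_SET reflects my reading
of the draft and is not shown in the excerpt above:

```python
import struct

AS_SET, AS_SEQUENCE = 1, 2


def prepend_as(as_path: bytes, local_as: int) -> bytes:
    """Prepend local_as to an encoded AS_PATH attribute value."""
    new_as = struct.pack("!H", local_as)
    if as_path:
        seg_type, seg_len = as_path[0], as_path[1]
        if seg_type == AS_SEQUENCE and seg_len < 255:
            # "Leftmost": first two octets after the length octet.
            return bytes([AS_SEQUENCE, seg_len + 1]) + new_as + as_path[2:]
    # First segment is full (255 ASs), is an AS_SET, or the path is empty:
    # start a new AS_SEQUENCE segment in front of the existing ones.
    return bytes([AS_SEQUENCE, 1]) + new_as + as_path


path = bytes([AS_SEQUENCE, 2]) + struct.pack("!HH", 65001, 65002)
longer = prepend_as(path, 65000)
```

With this reading the overflow case is not merely "legal" but the only
way to keep the segment well-formed, which is why a recommendation in
the text would help.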


> 5.1.4 MULTI_EXIT_DISC
> 
> 
>    The MULTI_EXIT_DISC is an optional non-transitive attribute which is
>    intended to be used on external (inter-AS) links to discriminate
>    among multiple exit or entry points to the same neighboring AS.  The
>    value of the MULTI_EXIT_DISC attribute is a four octet unsigned num-
>    ber which is called a metric. All other factors being equal, the exit
>    point with lower metric SHOULD be preferred. If received over EBGP,
>    the MULTI_EXIT_DISC attribute MAY be propagated over IBGP to other
>    BGP speakers within the same AS. The MULTI_EXIT_DISC attribute

A reference to 9.1.2.2 seems due here, since using MED in local
route calculation and not propagating it further is dangerous.

>    received from a neighboring AS MUST NOT be propagated to other neigh-
>    boring ASs.
> 
>    A BGP speaker MUST IMPLEMENT a mechanism based on local configuration
                        ^^^^^^^^^lower-case
                        
>    which allows the MULTI_EXIT_DISC attribute to be removed from a
>    route. This MAY be done prior to determining the degree of preference

what's the recommended behavior here?

>    of the route and performing route selection (decision process phases
>    1 and 2).
> 
>    An implementation MAY also (based on local configuration) alter the
>    value of the MULTI_EXIT_DISC attribute received over EBGP.  This MAY
>    be done prior to determining the degree of preference of the route

what's the recommended behavior here?

> 5.1.5 LOCAL_PREF
...
> A BGP speaker SHALL calculate the degree of preference for
>    each external route based on the locally configured policy, and

Should we be more honest here and say that the implementation must
allow the admin to SET the degree of preference through the local
policy to influence the best-path selection process? I don't
think any implementation really *calculates* it.

> 5.1.6 ATOMIC_AGGREGATE
...
>    A BGP speaker that receives a route with the ATOMIC_AGGREGATE
>    attribute MUST NOT make any NLRI of that route more specific (as
>    defined in 9.1.4) when advertising this route to other BGP speakers.

Since deaggregation is not described in this document, do we need this
para?

>   A BGP speaker that receives a route with the ATOMIC_AGGREGATE
>    attribute needs to be cognizant of the fact that the actual path to
>    destinations, as specified in the NLRI of the route, while having the
>    loop-free property, may not be the path specified in the AS_PATH
>    attribute of the route.

What does this really mean from the implementation perspective?

> 5.1.7 AGGREGATOR
> 
> 
>    AGGREGATOR is an optional transitive attribute which MAY be included
>    in updates which are formed by aggregation (see Section 9.2.2.2). A
>    BGP speaker which performs route aggregation MAY add the AGGREGATOR

What's the recommended behavior here? Include or not, and under what
circumstances?

> 6. BGP Error Handling.
...
>    The phrase "the BGP connection is closed" means that the TCP connec-
>    tion has been closed, the associated Adj-RIB-In has been cleared, and
>    that all resources for that BGP connection have been deallocated.
>    Entries in the Loc-RIB associated with the remote peer are marked as
>    invalid. The fact that the routes have become invalid is passed to
>    other BGP peers before the routes are deleted from the system.

What does "the fact is passed" mean? Should we instead say that local
route recalculation happens and peers are sent either updated best
routes or withdrawals?
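
The behavior I have in mind could be sketched as follows. The data
layout and names are hypothetical (adj_rib_in maps peer to {prefix:
path}, loc_rib maps prefix to (peer, path)), and select_best stands in
for the decision process:

```python
def close_peer(adj_rib_in, loc_rib, peer, select_best):
    """Clear a closed peer's routes, recalculate, and list what to tell other peers."""
    lost = adj_rib_in.pop(peer, {})  # clear the Adj-RIB-In for the closed peer
    messages = []
    for prefix in lost:
        current = loc_rib.get(prefix)
        if current is None or current[0] != peer:
            continue  # the best route came from another peer; nothing changes
        candidates = [(p, routes[prefix])
                      for p, routes in adj_rib_in.items() if prefix in routes]
        if candidates:
            # Another path exists: advertise the new best route.
            loc_rib[prefix] = select_best(candidates)
            messages.append(("UPDATE", prefix, loc_rib[prefix][1]))
        else:
            # No path left: withdraw the route.
            del loc_rib[prefix]
            messages.append(("WITHDRAW", prefix))
    return messages
```

This is only one reading of "the fact is passed"; the point is that the
spec could state the outcome in these terms.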

> 6.2 OPEN message error handling.
...
>    If the Autonomous System field of the OPEN message is unacceptable,
>    then the Error Subcode is set to Bad Peer AS. The determination of
>    acceptable Autonomous System numbers is outside the scope of this
>    protocol.

Shouldn't we say that configuration-based detection should be
supported, i.e., when remote-as is configured for the peer?

...
>   If the BGP Identifier field of the OPEN message is syntactically
>    incorrect, then the Error Subcode is set to Bad BGP Identifier.  Syn-
>    tactic correctness means that the BGP Identifier field represents a
>    valid IP host address.

Is "valid IP host address" defined somewhere, btw?

> 6.3 UPDATE message error handling.
> 
> 
>    All errors detected while processing the UPDATE message are indicated
>    by sending the NOTIFICATION message with Error Code UPDATE Message
>    Error. The error subcode elaborates on the specific nature of the
>    error.

"are indicated..." is this a MUST, SHOULD, or MAY?
...
>    If the ORIGIN attribute has an undefined value, then the Error Sub-
>    code is set to Invalid Origin Attribute. The Data field contains the
>    unrecognized attribute (type, length and value).

Curious: do we really have to drop the session on this condition? The
attribute was syntactically correct and the TLV was not broken, so the
stream is still in sync and we could move on. Of course,
if this is what current implementations do, we have no other choice.

...
>    If the UPDATE message is received from an external peer, the local
>    system MAY check whether the leftmost AS in the AS_PATH attribute is

Same comment about 'leftmost'... Maybe we should define this somewhere
in the beginning of the spec?

...
>    The NLRI field in the UPDATE message is checked for syntactic valid-
>    ity. If the field is syntactically incorrect, then the Error Subcode
>    is set to Invalid Network Field.

Should we give more data on what syntactic validity means in this case
so people behave consistently?
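
As a strawman for what "syntactically valid" could mean, here is a
sketch that checks only two things: the prefix length is at most 32 bits
and enough octets follow to hold it. These criteria are my assumption,
not text from the draft, and the function name is made up:

```python
import ipaddress


def parse_nlri(nlri: bytes) -> list:
    """Parse an IPv4 NLRI field of (length-in-bits, prefix) pairs."""
    prefixes = []
    i = 0
    while i < len(nlri):
        bits = nlri[i]
        if bits > 32:
            raise ValueError("prefix length exceeds 32 bits")
        nbytes = (bits + 7) // 8  # minimum octets needed to hold the prefix
        chunk = nlri[i + 1:i + 1 + nbytes]
        if len(chunk) != nbytes:
            raise ValueError("NLRI field ends inside a prefix")
        addr = int.from_bytes(chunk.ljust(4, b"\x00"), "big")
        # Trailing bits beyond the prefix length are masked off.
        mask = (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF
        prefixes.append(ipaddress.ip_network((addr & mask, bits)))
        i += 1 + nbytes
    return prefixes
```

Whether other conditions (for example, particular address ranges) also
count as syntactic errors is the kind of detail the text should settle.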

> 6.7 Cease.
...
> If the BGP speaker decides to terminate its BGP
>    connection with a neighbor because the number of address prefixes
>    received from the neighbor exceeds the locally configured upper
>    bound, then the speaker MUST send to the neighbor a NOTIFICATION mes-
>    sage with the Error Code Cease.

Should we also say that when the peer decides to discard incoming
prefixes, this event should be logged locally?


> 8. BGP Finite State machine

General comment: I would _really_ appreciate more people looking at this
section.

>    The optional Session attributes are listed below. These optional
>    attributes may be supported either per connection or per local sys-
>    tem:
> 
>         1) Delay Open flag

Where's the description of this flag and how/when is it set? Same for
others below. Should we have a brief description for each attribute?

>         2) Open Delay Timer
>         3) Perform automatic start flag
>         4) Perform automatic stop flag
>         5) Passive TCP establishment flag
>         6) Perform BGP peer oscillation damping flag
>            (which will be denoted as stop_peer_flap in text)
>         7) Idle Hold timer
>         8) Perform Collision detect in Established flag
>         9) Accept connections from un-configured peers
>        10) Track TCP state flag
>        11) Send NOTIFICATION without an OPEN flag
> 
Suggestion: to make reading of the FSM description below easier, we
could "merge" the multiword flag names and normalize them, e.g.
'perform automatic start flag' to 'PerformAutoStart flag', 'Passive
TCP establishment flag' to 'PassiveTCPEstablishment flag',
'stop_peer_flap' to 'StopPeerFlap'.

> 8.1.1 Administrative Events
> 
> 
>    Please note that only Event 1 (manual start) and Event 2 (manual
>    stop) are mandatory administrative events. All other administrative
>    events are optional. The optional attributes do not have to be sup-
>    ported. However, if these attributes are supported, the state of the
>    flags should be as indicated.

'flags should be as indicated' does not give a clear understanding of
what they are used for. Should the events be sanity-checked by
checking those attributes? What's the recommended behavior when the
flags are in a different state?

>        Event3: Automatic start
> 
>               Definition: Local system automatically starts the
>                           BGP connection.
> 

When is this event generated by the system? Under what conditions?
>               Status:     Optional depending on local system.
> 
>               Optional
>               attributes: 1) Perform automatic start flag SHOULD be set
>                              if this event occurs.
>                           2) if the passive Passive TCP establishment flag

passive Passive?

>        Event5: Automatic start with passive TCP flag
> 
>               Definition: Local system automatically starts the
>                           BGP connection with the passive flag
>                           enabled.  The passive flag indicates
> 

Same question about generation conditions
...
>        Event23: Open collision dump
> 
>               Definition: An event generated administratively
>                           when a connection collision has been
>                           detected while processing an incoming
>                           OPEN message and this connection has been
>                           selected to disconnected. See Section
'to be disconnected'
>                           6.8 for more information on collision
>                           detection.
> 
>                           Event23 is an administrative based only
'based on'?
>                           implementation specific policy. This
>                           Event may occur if the FSM is implemented
>                           as two linked state machines.
> 
> 
>               Status:     Optional, depending on local system
> 
>               Optional
>               Attributes: If the state machine is to process this
>                           attribute in Established state,
>                            1) Peform Collision detect in Established
'Perform'
>                                flag SHOULD be set.

...

>        Event25: NotifMsg
> 
>               Definition: An event is generated when a
>                           NOTIFICATION messages is received and
message
>                           the error code is anything but
>                           "version error".
> 
>               Status:     Mandatory


> 8.2.1 FSM Definition
> 
> 
>    BGP MUST maintain a separate FSM for each configured peer, Each BGP
>    peer paired in a potential connection unless configured to remain in
>    the idle state, or configured to remain passive, will attempt to  to
to to

>    connect to the other.  For the purpose of this discussion, the active
>    or connect side of the TCP connection (the side of a TCP connection
'active or connecting'?

>    sending the first TCP SYN packet) is called outgoing.  The passive or
>    listening side (the sender of the first SYN ACK) is called an incom-
>    ing connection (see Section 8.2.1.1 on the terms active and passive
>    below).
> 
>    A BGP implementation MUST connect to and listen on TCP port 179 for
>    incoming connections in addition to trying to connect to peers.  For
>    each incoming connection, a state machine MUST be instantiated.

Is this true for implementations that resolve connection collision
through one FSM with two transport connections?

> 8.2.1.1 Terms "active" and "passive"
> 
> 
>    The terms active and passive have been in our vocabulary for almost a
>    decade and have proven useful.  

Ed: The style here is quite different from the rest of the document
(i.e., personalization), plus time values tend to become outdated with
time :)

> 8.2.1.2 FSM and collision detection
> 
> 
>    There is one FSM per BGP connection.  Prior to determining what peer
>    a connection is associated with there may be two connections for a
>    given peer.  There SHOULD be no more than one connection per peer.

Is the above "SHOULD" normative? I.e., should it be "should" instead?

>   The collision detection identifies the case where there is more than
>    one connection per peer and provides guidance for which connection to
>    get rid of.  When this occurs, the corresponding FSM for the connec-
>    tion that is closed SHOULD be disposed of.
> 
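To make the collision case concrete, here is a minimal sketch of picking
the connection to keep. The tie-break (the connection initiated by the
speaker with the higher BGP Identifier survives) is my reading of the
collision rule and is an assumption here; the names Connection and
resolve_collision are made up for the example and are not in the draft.

```python
from dataclasses import dataclass


@dataclass
class Connection:
    initiated_locally: bool  # True if this side sent the first TCP SYN
    is_open: bool = True


def resolve_collision(local_id, remote_id, first, second):
    """Keep one of two connections to the same peer and close the other.

    Assumed rule: the connection initiated by the speaker with the
    higher BGP Identifier is the one that survives.
    """
    keep_local = local_id > remote_id
    if first.initiated_locally == keep_local:
        keep, drop = first, second
    else:
        keep, drop = second, first
    drop.is_open = False  # the FSM tied to this connection would be disposed of
    return keep
```

Whether the two connections are tracked by two FSMs or by one FSM with
two transport connections does not matter for this comparison.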

BTW, I think the specification would really benefit from a section
that describes processing of incoming transport connections.

> 8.2.2 Finite State Machine
> 
> 
>       Idle state:
> 
>          Initially BGP is in the Idle state.

Not BGP, but the peer FSM, right?

> 
>          In this state BGP refuses all incoming BGP connections.  No

all incoming connections from that peer?

> 
>          resources are allocated to the peer. In response to a
>          manual start event(Event1) or an automatic start
>          event(Event3), the local system:
>             - initializes all BGP resources,
all BGP resources or only those needed for the peer?
also, what does 'initialize' mean here?

>             - sets ConnectRetryCnt (the connect retry counter) to zero

Seems we have inconsistency in FSM parameter naming here.

>         In response to a manual start event with the passive TCP connection
>         flag (Event 4) or automatic start with the passive TCP connection
>         flag (Event 5), the local system:
>             - initializes all BGP resources,
>             - sets ConnectRetryCnt (the connect retry counter) to zero,
>             - starts the Connect Retry timer with initial value,
>             - listens for a connection that may be initiated by
>               the remote peer, and
>             - changes its state to Active.

Ditto comments here

>         The method of preventing persistent peer oscillation is
>         outside the scope of this document.

So we have these events, but we don't define how to handle them?

>         Any other events [Events 9-12, 15-28] received in the Idle state
>         does not cause change in the state of the local system.

'do not cause changes'  ?

>         In response to a manual stop event [Event2], the local system:
>            - drops the TCP connection,
>            - releases all BGP resources,
>            - sets ConnectRetryCnt (the connect retry count) to zero
>            - sets the Connect Retry timer to zero, and

sets timer to zero? 'Stops the timer' instead?

>            - changes its state to Idle.
...

>         If the BGP port receives a valid TCP connection indication
BGP port?
>         [Event 14], the TCP connection is processed and
>         the connection remains in the Connect state.
> 
>         If the TCP connection receives an invalid indication [Event 15]:

TCP connection receives?

>         the local system rejects the TCP connection and the connection
>         remains in the Connect state.
> 
>         If the TCP connection succeeds [Event 16 or Event 17],
>         the local system checks the Delay Open flag prior to
>         processing. If the Delay Open flag is set, the local system:
>              - sets the Connect Retry timer to zero,
"stops" instead?

>              - set the Open Delay timer to the initial value, and

sets

>              - stays in the Connect state.
>         If the Delay Open flag is not set, the local system:
>              - sets the Connect Retry timer to zero,
stops

>              - completes BGP initialization

What does the above really mean?

...
>         the Open Delay Timer. If the Open Delay timer is running,
>         the local system:
>             - restarts the connect retry time with initial value,
>             - stops the Open Delay timer and resets value to zero,
>             - continues to listen for a connection that may be
>               initiated by the remote BGP peer, and
>             - changes its state to Active.
>         If the open Delay timer is not running, the local system:
>            - sets the Connect Retry timer to zero,
>            - drops the TCP connection,
>            - releases all BGP resources, and
all BGP resources?

>            - changes its state to Idle.
> 
>         If an OPEN message is received with the Open Delay timer is
>         running [Event 20], the local system:
>            - sets the Connect Retry timer to zero,
>            - completes the BGP initialization,
What does it mean?

>            - stops and clears the Open Delay timer (sets the value to zero),
>            - sends an OPEN message,
>            - sends a KEEPALIVE message,
>            - If the hold timer value is non-zero,
>                    - start the keepalive timer to inital value,
"starts"... start to initial value?

>                    - reset the hold timer to the negotiated value,
Resets

>              else if hold timer value is zero,
>                    - reset the keepalive timer, and

resets

>                    - reset the hold timer value to zero
resets

>            - and changes its state to OpenConfirm.
> 
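The Event 20 steps quoted above could be read as the following sketch.
It is only an illustration: PeerFsm and its field names are hypothetical,
timers are reduced to running/not-running flags, and the "completes the
BGP initialization" step is deliberately left as a comment because the
draft does not say what it entails.

```python
from dataclasses import dataclass, field


@dataclass
class PeerFsm:
    state: str = "Connect"
    connect_retry_running: bool = True
    open_delay_running: bool = True
    keepalive_running: bool = False
    hold_time: int = 0
    sent: list = field(default_factory=list)


def on_open_while_delay_running(fsm, negotiated_hold_time):
    """Handle an OPEN received while the Open Delay timer is running."""
    fsm.connect_retry_running = False   # stop the Connect Retry timer
    # "completes the BGP initialization": omitted, meaning is unclear
    fsm.open_delay_running = False      # stop and clear the Open Delay timer
    fsm.sent.append("OPEN")
    fsm.sent.append("KEEPALIVE")
    if negotiated_hold_time != 0:
        fsm.keepalive_running = True    # start the keepalive timer
        fsm.hold_time = negotiated_hold_time
    else:
        fsm.keepalive_running = False   # no keepalives when hold time is zero
        fsm.hold_time = 0
    fsm.state = "OpenConfirm"
```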

OK, I'll stop reviewing the FSM text here and skip to the next
section. Given the number of English grammar mistakes, it is clear to
me that either the text has not been read carefully enough by a
reviewer, or the review comments have not been incorporated.
Please address.

...
> 9. UPDATE Message Handling
> 
> 
>    An UPDATE message may be received only in the Established state.
What if it is received in another state?

...
> 9.1 Decision Process
> 
> 
>    The Decision Process selects routes for subsequent advertisement by
>    applying the policies in the local Policy Information Base (PIB) to
>    the routes stored in its Adj-RIBs-In. The output of the Decision Pro-
>    cess is the set of routes that will be advertised to peers; the
>    selected routes will be stored in the local speaker's Adj-RIB-Out
RIB-Out or RIBs-out (plural)?

>    according to policy.
> 
>    The selection process is formalized by defining a function that takes
>    the attribute of a given route as an argument and returns either (a)
>    a non-negative integer denoting the degree of preference for the
>    route, or (b) a value denoting that this route is ineligible to be
>    installed in LocRib and will be excluded from the next phase of route

Loc-RIB
>    selection.
...
>    The Decision Process operates on routes contained in the Adj-RIB-In,
Adj-RIBs-In (plural) ?
>    and is responsible for:

> 9.1.1 Phase 1: Calculation of Degree of Preference
...
>       If the route is learned from an external peer, then the local BGP
>       speaker computes the degree of preference based on preconfigured
>       policy information. If the return value indicates that the route
>       is ineligible, the route MAY NOT serve as an input to the next
>       phase of route selection; otherwise the return value is used as
>       the LOCAL_PREF value in any IBGP readvertisement.

So, AFAIK, the major implementations do not follow this step
(calculating the degree of preference, and then announcing). Instead,
implementations allow setting the LOCAL_PREF value locally, which is
taken into consideration during the best path selection, and is also
reannounced further.

Also "is used" is not specific enough. Is it SHOULD or MUST?
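
For reference, the Phase 1 step as the quoted text describes it might
look like the sketch below. The route is a plain dict with hypothetical
field names, `policy` stands in for the preconfigured policy, and the
internal-peer branch (take LOCAL_PREF as received) is an assumption,
since that part of the section is elided above.

```python
def phase1_preference(route, policy):
    """Return the degree of preference, or None if the route is ineligible.

    `policy` is a callable returning a non-negative int or None.
    """
    if route["from_external_peer"]:
        pref = policy(route)
    else:
        # Assumption: for internal peers, reuse the received LOCAL_PREF.
        pref = route.get("local_pref")
    if pref is None:
        return None  # excluded from the next phase of route selection
    # The return value is what gets carried as LOCAL_PREF towards IBGP peers.
    route["local_pref"] = pref
    return pref
```

In this form there is no practical difference between "calculating the
degree of preference" and "setting LOCAL_PREF locally"; the policy
callable is where an operator-configured value would come from.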

> 9.1.2 Phase 2: Route Selection
...
>    If the AS_PATH attribute of a BGP route contains an AS loop, the BGP
>    route should be excluded from the Phase 2 decision function.  AS loop
>    detection is done by scanning the full AS path (as specified in the
>    AS_PATH attribute), and checking that the autonomous system number of
>    the local system does not appear in the AS path.  Operations of a BGP
>    speaker that is configured to accept routes with its own autonomous
>    system number in the AS path are outside the scope of this document.

If we're checking for an AS loop here (in Phase 2) as opposed to
during the UPDATE message sanity checking, the route is already
received and accepted in the peer's Adj-RIB-In. Those implementations
I know don't even install such routes in the RIB...
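
The check itself is the same wherever it is placed. A minimal sketch,
assuming the AS path is held as a list of (segment type, AS numbers)
pairs; that representation and the function name are mine:

```python
def has_as_loop(as_path, local_as):
    """Scan every segment of the AS path for the local AS number."""
    return any(local_as in asns for _segment_type, asns in as_path)
```

Running it at UPDATE sanity-check time or at the start of Phase 2 only
changes whether the route is ever stored, not the result of the scan.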

> 9.1.2.2 Breaking Ties (Phase 2)
...
>       Similarly, neighborAS(n) is a function which returns the neighbor
>       AS from which the route was received.  If the route is learned via
>       IBGP, and the other IBGP speaker didn't originate the route, it is
>       the neighbor AS from which the other IBGP speaker learned the
>       route. If the route is learned via IBGP, and the other IBGP
>       speaker originated the route, it is the local AS.

What if the route is locally originated?
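
To show the gap, here is the quoted definition written out as a sketch.
The dict fields are hypothetical, and the locally originated case raises
an error on purpose, because the quoted text does not define it:

```python
def neighbor_as(route, local_as):
    """neighborAS(n) as described in the quoted paragraph."""
    if route["source"] == "ebgp":
        return route["peer_as"]
    if route["source"] == "ibgp":
        if route["originated_by_ibgp_peer"]:
            return local_as
        # AS from which the other IBGP speaker learned the route
        return route["learned_from_as"]
    raise NotImplementedError("locally originated: not covered by the text")
```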

> 9.1.4 Overlapping Routes
...
>    When overlapping routes are present in the same Adj-RIB-In, the more
>    specific route takes precedence, in order from more specific to least
>    specific.
> 
Doesn't this happen at the packet forwarding stage?

> 
>    The set of destinations described by the overlap represents a portion
>    of the less specific route that is feasible, but is not currently in
>    use.  If a more specific route is later withdrawn, the set of desti-
>    nations described by the overlap will still be reachable using the
>    less specific route.
> 
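The behaviour described above is what a longest-prefix match gives. A
small sketch using Python's standard ipaddress module (the function name
and the IPv4-only example prefixes are assumptions for illustration):

```python
import ipaddress


def most_specific(prefixes, destination):
    """Return the longest matching prefix for a destination, or None."""
    dest = ipaddress.ip_address(destination)
    matches = [
        net
        for net in (ipaddress.ip_network(p) for p in prefixes)
        if dest in net
    ]
    return max(matches, key=lambda net: net.prefixlen, default=None)
```

With both 10.0.0.0/8 and 10.1.0.0/16 present, 10.1.2.3 matches the /16;
once the /16 is withdrawn, the same destination falls back to the /8.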
>    If a BGP speaker receives overlapping routes, the Decision Process
>    MUST consider both routes based on the configured acceptance policy.
>    If both a less and a more specific route are accepted, then the Deci-
>    sion Process MUST either install both the less and the more specific
Install where?

>    routes or it MUST aggregate the two routes and install the aggregated
>    route, provided that both routes have the same value of the NEXT_HOP
>    attribute.

Does anyone really do the latter?

>    If a BGP speaker chooses to aggregate, then it SHOULD either include
>    all AS used to form the aggreagate in an AS_SET or add the
>    ATOMIC_AGGREGATE attribute to the route.  This attribute is now pri-
>    marily informational.  With the elimination of IP routing protocols
>    that do not support classless routing and the elimination of router
>    and host implementations that do not support classless routing, there
>    is no longer a need to deaggregate.  Routes SHOULD NOT be de-aggre-
>    gated.  A route that carries ATOMIC_AGGREGATE attribute in particular
>    MUST NOT be de-aggregated. That is, the NLRI of this route can not be
>    made more specific. Forwarding along such a route does not guarantee
>    that IP packets will actually traverse only ASs listed in the AS_PATH
>    attribute of the route.

Since we don't do deaggregation any more, should we remove the
discussion about it completely and indicate in the "changes" section
that deaggregation has been deprecated?

> 9.2 Update-Send Process
...
>    When a BGP speaker receives an UPDATE message from an internal peer,
>    the receiving BGP speaker SHALL NOT re-distribute the routing infor-
>    mation contained in that UPDATE message to other internal peers,
>    unless the speaker acts as a BGP Route Reflector [RFC2796].

I suggest putting "unless..." in parentheses () to make it more
apparent that this is not a normative ref.

> 9.2.1.1 Frequency of Route Advertisement
>    Since fast convergence is needed within an autonomous system, either
>    (a) the MinRouteAdvertisementInterval used for internal peers SHOULD
>    be shorter than the MinRouteAdvertisementInterval used for external
>    peers, or (b) the procedure describe in this section SHOULD NOT apply
>    for routes sent to internal peers.

It sounded like MinRouteAdvertisementInterval was an architectural
constant, but now it sounds like either this is a timer that can be
assigned different settings or there are two constants:
MinRouteAdvIntIBGP and MinRouteAdvIntEBGP.
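
If it is really two settings, an implementation would look roughly like
the sketch below. The class, its method, and the 5-second internal value
are placeholders of mine; only the 30-second default appears in the
draft.

```python
class AdvertisementLimiter:
    """Rate-limit advertisements per (peer, destination)."""

    def __init__(self, interval_ebgp=30.0, interval_ibgp=5.0):
        # interval_ibgp is a hypothetical value, not taken from the draft
        self.interval = {"ebgp": interval_ebgp, "ibgp": interval_ibgp}
        self.last_sent = {}

    def may_advertise(self, peer, peer_type, destination, now):
        key = (peer, destination)
        last = self.last_sent.get(key)
        if last is not None and now - last < self.interval[peer_type]:
            return False  # too soon since the previous advertisement
        self.last_sent[key] = now
        return True
```

Option (b) in the quoted text would correspond to interval_ibgp=0.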

> 9.2.2.2 Aggregating Routing Information
> 

Hmmm... I expected to see in this section some text talking about when
and how an aggregate would be announced, i.e., when an aggregate
prefix is configured and more specific routes are present, the
aggregate is announced; when no specifics are left, the aggregate is
withdrawn. I haven't found anything on this topic...
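
Something along these lines is what I had in mind. This is only a sketch
of the rule as stated in my comment, not text from the draft; the
function name and the IPv4-only handling are assumptions.

```python
import ipaddress


def aggregate_action(aggregate, rib_prefixes):
    """Announce a configured aggregate only while a more specific exists."""
    agg = ipaddress.ip_network(aggregate)
    for prefix in rib_prefixes:
        net = ipaddress.ip_network(prefix)
        # strictly more specific: inside the aggregate but not equal to it
        if net != agg and net.subnet_of(agg):
            return "announce"
    return "withdraw"
```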


> 9.3 Route Selection Criteria
>
>    Generally speaking, additional rules for comparing routes among sev-
>    eral alternatives are outside the scope of this document. There are
>    two exceptions:
> 
>       - If the local AS appears in the AS path of the new route being
>       considered, then that new route can not be viewed as better than
>       any other route (provided that the speaker is configured to accept
>       such routes). If such a route were ever used, a routing loop could
>       result.
> 
>       - In order to achieve successful distributed operation, only
>       routes with a likelihood of stability can be chosen. Thus, an AS
>       SHOULD avoid using unstable routes, and it SHOULD NOT make rapid
>       spontaneous changes to its choice of route. Quantifying the terms
>       "unstable" and "rapid" in the previous sentence will require expe-
>       rience, but the principle is clear.
> 

Where does the second exception fit within the route selection
criteria, and how does it affect them?

>    Care must be taken to ensure that BGP speakers in the same AS do not
>    make inconsistent decisions.

How? What does this mean for the implementor?

> 9.4 Originating BGP routes
> 
>    A BGP speaker may originate BGP routes by injecting routing informa-
>    tion acquired by some other means (e.g. via an IGP) into BGP. A BGP
>    speaker that originates BGP routes assigns the degree of preference
> 

"assigns the degree of preference"... how do the implementations
really do that?

> 10 BGP Timers
...
>    The suggested default value for the MinRouteAdvertisementInterval is
>    30 seconds.

This was described as a parameter, not a timer. Further, it was
earlier suggested that it should be shorter for iBGP than it is for
eBGP. I'd expect the document to specify the recommended value for
both.

> IANA Considerations
...
>    All extensions to this protocol, including new message types and Path
>    Attributes MUST only be made using the Standards Action process
>    defined in [RFC2434].

This section should include the description of each registry that
needs to be created (if needed) and maintained by IANA, as well as the
allocation policy that is in the text already.

<EOM>



Message-Id: <200305052009.QAA04798@workhorse.fictitious.org>
To: Jeffrey Haas <jhaas@nexthop.com>
Cc: Yakov Rekhter <yakov@juniper.net>, idr@merit.edu
Reply-To: curtis@fictitious.org
Subject: Re: Issue 19) Security Considerations 
In-reply-to: Your message of "Mon, 05 May 2003 14:45:13 EDT." <20030505144513.B17555@nexthop.com> 
Date: Mon, 05 May 2003 16:09:20 -0400
From: Curtis Villamizar <curtis@fictitious.org>
Sender: owner-idr@merit.edu
Precedence: bulk

In message <20030505144513.B17555@nexthop.com>, Jeffrey Haas writes:
> Yakov,
> 
> On Mon, May 05, 2003 at 09:59:14AM -0700, Yakov Rekhter wrote:
> > I don't recall seeing any objections to adding this to the document.
> 
> It was more that I hadn't heard anything from anyone one way or the
> other. 
> 
> > Yakov.
> 
> -- 
> Jeff Haas 
> NextHop Technologies


Neither did I, in case there was any question about whether something
was being worked out off list.

Curtis



Date: Mon, 5 May 2003 14:45:13 -0400
From: Jeffrey Haas <jhaas@nexthop.com>
To: Yakov Rekhter <yakov@juniper.net>
Cc: idr@merit.edu
Subject: Re: Issue 19) Security Considerations
Message-ID: <20030505144513.B17555@nexthop.com>
References: <20030430124022.K24007@nexthop.com> <200305051659.h45GxEu26987@merlot.juniper.net>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.2.5i
In-Reply-To: <200305051659.h45GxEu26987@merlot.juniper.net>; from yakov@juniper.net on Mon, May 05, 2003 at 09:59:14AM -0700
X-Virus-Scanned: by AMaViS perl-11
Sender: owner-idr@merit.edu
Precedence: bulk

Yakov,

On Mon, May 05, 2003 at 09:59:14AM -0700, Yakov Rekhter wrote:
> I don't recall seeing any objections to adding this to the document.

It was more that I hadn't heard anything from anyone one way or the
other. 

> Yakov.

-- 
Jeff Haas 
NextHop Technologies


Message-Id: <200305051659.h45GxEu26987@merlot.juniper.net>
To: Jeffrey Haas <jhaas@nexthop.com>
Cc: idr@merit.edu
Subject: Re: Issue 19) Security Considerations 
In-Reply-To: Your message of "Wed, 30 Apr 2003 12:40:23 EDT." <20030430124022.K24007@nexthop.com> 
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-ID: <31248.1052153954.1@juniper.net>
Date: Mon, 05 May 2003 09:59:14 -0700
From: Yakov Rekhter <yakov@juniper.net>
Sender: owner-idr@merit.edu
Precedence: bulk

Jeff,

> Curtis wrote in a previous message:
> 
> On Thu, Apr 03, 2003 at 09:36:06AM -0500, Curtis Villamizar wrote:
> > --- draft-murphy-bgp-vuln-02.txt	Wed Mar  5 21:00:00 2003
> > +++ draft-murphy-bgp-vuln-02.txt++	Thu Apr  3 09:18:12 2003
> > @@ -149,6 +149,7 @@
> >  3.2.2.2 Timer events ..............................................   16
> >  4 Security Considerations .........................................   16
> >  4.1 Residual Risk .................................................   16
> > +4.2 Practical Considerations ......................................   16
> >  5 References ......................................................   17
> >  6 Author's Address ................................................   18
> >  
> > @@ -901,6 +902,79 @@
> >  Filtering is in use near some customer attachment points, but is not
> >  effective near the Internet center.  The other mechanisms are still
> >  controversial and are not yet in common use.
> > +
> > +4.2 Practical Considerations
> [...]
> 
> This looks like it has good merit.  Shouldn't we add this to the document?
> (Well, not "we" since Sandy is authoring it, but it seems like a good idea.)

I don't recall seeing any objections to adding this to the document.

Yakov.


Message-Id: <200305051322.h45DMDu06347@merlot.juniper.net>
To: Shankar Vemulapalli <svemulap@cisco.com>
Cc: idr@merit.edu
Subject: Re: draft-ietf-idr-bgp4-20.txt 
In-Reply-To: Your message of "Sat, 03 May 2003 15:06:46 PDT." <Pine.GSO.4.53.0305031457570.3207@sj-cse-138.cisco.com> 
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-ID: <60260.1052140933.1@juniper.net>
Date: Mon, 05 May 2003 06:22:13 -0700
From: Yakov Rekhter <yakov@juniper.net>
Sender: owner-idr@merit.edu
Precedence: bulk

Shankar,

> Hi -
> 
> Not sure if this is already pointed out -
> 
> In draft-ietf-idr-bgp4-20.txt - page  14
>          [RFC2842] defines the Capabilities Optional Parameter
> and on page 85 -
>    [RFC2842] R. Chandra, J. Scudder, "Capabilities Advertisement with
>    BGP-4", RFC2842.
> 
> should be changed to newer RFC  - RFC3392 - to reflect the latest info.

Agreed.

Yakov.


Date: Sat, 3 May 2003 15:06:46 -0700 (PDT)
From: Shankar Vemulapalli <svemulap@cisco.com>
To: idr@merit.edu
Subject: draft-ietf-idr-bgp4-20.txt
Message-ID: <Pine.GSO.4.53.0305031457570.3207@sj-cse-138.cisco.com>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Sender: owner-idr@merit.edu
Precedence: bulk

Hi -

Not sure if this has already been pointed out -

In draft-ietf-idr-bgp4-20.txt - page  14
         [RFC2842] defines the Capabilities Optional Parameter
and on page 85 -
   [RFC2842] R. Chandra, J. Scudder, "Capabilities Advertisement with
   BGP-4", RFC2842.

should be changed to the newer RFC - RFC3392 - to reflect the latest info.

Thanks,

/Shankar


Date: Sat, 3 May 2003 01:05:20 -0700 (PDT)
Message-Id: <200305030805.h4385Kd51107@roque-bsd.juniper.net>
From: Pedro Roque Marques <roque@juniper.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
To: Alex Zinin <zinin@psg.com>
Cc: ppvpn@nortelnetworks.com, idr@merit.edu
Subject: On BGP and VPLS
In-Reply-To: <51133594448.20030502191439@psg.com>
References: <51133594448.20030502191439@psg.com>
X-Mailer: VM 6.34 under 19.16 "Lille" XEmacs Lucid
Sender: owner-idr@merit.edu
Precedence: bulk

[crossposted to idr WG mailing list]

Alex Zinin writes:


>  More specifically, below I tried to put together a list of concerns
> I have about the approach described in draft-kompella-ppvpn-vpls,
> that I would like the WG to consider.

>  1. Use of the NLRI field

>    As an IP routing protocol, BGP uses the NLRI field to carry IP
> reachability information in the form of IP prefixes. Prefixes within
> the NLRI field are used for two main purposes in BGP: a) as the
> destination/mask pair in the routes installed by BGP in the routing
> table, and b) as a handle to an entry in the BGP RIBs.

>    The document in subject changes the semantics of the NLRI field
> quite substantially even when compared to 2547. First, all of its IP
> prefix-related properties are lost. There is no more IP routing, or
> any addressing information in it. Second, the notion of TLVs is
> introduced inside this field, which a) is not needed in BGP as an IP
> routing protocol, and b) because of its variable length property
> changes the nature of the NLRI contents, i.e., it's being a stable
> handle in the BGP database. To solve these problems the
> implementations would need to use only a part of the contents of the
> NLRI field as the handle used to index within the RIBs, and store
> the rest as attributes.

Alex,
This point seems to be predicated on the statement that "BGP uses the
NLRI field to carry IP reachability"...

It opens up a sort of philosophical discussion on BGP. This is of
course a highly subjective topic, which is hard to quantify or to prove
in logical terms.

Allow me to present my personal view.

BGP is a particular implementation of an algorithm that performs
non-looping flooding distribution of a database. That algorithm consists
mostly of the path vector (used both in EBGP and route reflection) plus
route advertisement rules. This is the publicly specified part of the
beast.

However, that ends up being about 10% of the database exchange
algorithm. Each implementation uses distinct algorithms to do the real
heavy lifting: the advertisement of database updates to its peers,
given that each peer is allowed to flow control and that the amount
of information to be distributed is typically non-trivial compared to
the resources of the system.

None of the functions above actually depends on the format of your
database records, as long as there is a primary key associated with
each record. Modern implementations, given that they are required to
handle 3 or 4 different types of records w/ different keys (IPv4, IPv6,
2547, 2547-for-IPv6, etc.), will tend to treat these keys just as
database systems do: as a bit string without any semantics associated
w/ it.
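
A minimal sketch of what I mean by an opaque key, assuming one table per
(AFI, SAFI) pair and the NLRI held as raw bytes. The class and method
names are invented for the example:

```python
from collections import defaultdict


class Rib:
    """One table per (AFI, SAFI); keys are uninterpreted byte strings."""

    def __init__(self):
        self.tables = defaultdict(dict)

    def update(self, afi, safi, key, attributes):
        self.tables[(afi, safi)][key] = attributes

    def withdraw(self, afi, safi, key):
        self.tables[(afi, safi)].pop(key, None)

    def lookup(self, afi, safi, key):
        return self.tables[(afi, safi)].get(key)
```

Nothing in update, withdraw or lookup needs to know whether the bytes
encode an IPv4 prefix or a SAFI 128 record; the same bytes under a
different (AFI, SAFI) are simply a different record.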

Note also that the number of distinct tables exchanged in a 2547
implementation may be in the thousands. So segregation of which record
belongs to which table is necessarily a solved problem in practice.

There is, however, one part of BGP that interacts w/ the semantics of
the particular database being exchanged: route selection from the
Loc-RIB.

The Loc-RIB is by definition where BGP interacts w/ remaining users of
the database and it includes rules that are system specific.

As an exercise, if we take the existing spec and do:
s/route/record/g
s/IP prefix/key/g

Do we still have a document that makes sense?
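
For anyone who wants to try the exercise, the two substitutions can be
applied like this. It is a literal transcription of the sed expressions
above; the function name is mine. Note that, being literal, it would also
rewrite the "route" inside "router".

```python
import re


def generalize(text):
    """Apply s/route/record/g and s/IP prefix/key/g to a piece of text."""
    text = re.sub(r"route", "record", text)
    return re.sub(r"IP prefix", "key", text)
```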

Except for the vague bits about aggregation, about which BGP itself
does little, I would contend that the result would be pretty
much the same.

2547, which you cite, is a particularly good example, imho. A 2547 NLRI
ends up being used to create IP reachability information, but while it
is a SAFI 128 record, it is not IP reachability and it is not treated
as such.

>  2. Distribution of information

>    When used as an IP routing protocol, BGP distributes routes among
> all participating routers. Each router (PE or P using VPN
> terminology) is interested in _all_ routes received from its peers;
> it selects the best path for each prefix if multiple are available
> and installs it in it's routing table; the best paths are propagated
> further to other peers.

That is not the case w/ 2547. PE routers typically have interest in
only a subset of the routing information. They tend to do inbound
filtering in current network deployements but one can also do outbound
filtering in the RRs via either extended-community ORF or subsequent
improvements to ORF (draft-marques-ppvpn-rt-contrain).
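
As a rough illustration of the inbound filtering case, the sketch below
keeps only the records whose route targets a PE imports. The VpnRecord
type, the inbound_filter function and the "target:..." strings are
hypothetical names made up for this example.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class VpnRecord:
    key: bytes                      # opaque NLRI
    route_targets: FrozenSet[str]   # extended communities on the record


def inbound_filter(records: Iterable[VpnRecord],
                   imported: FrozenSet[str]) -> List[VpnRecord]:
    """Keep only records carrying at least one locally imported target."""
    return [r for r in records if r.route_targets & imported]


received = [
    VpnRecord(b"k1", frozenset({"target:65000:1"})),
    VpnRecord(b"k2", frozenset({"target:65000:2"})),
    VpnRecord(b"k3", frozenset({"target:65000:1", "target:65000:3"})),
]
# This PE only serves the VPN that imports target:65000:1.
kept = inbound_filter(received, frozenset({"target:65000:1"}))
```

Outbound filtering at the RR would apply the same test before sending,
using the set of targets the PE has asked for.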

P routers do not carry 2547 routing information.

>    The way BGP is used in the document results in a situation where
> information relevant only to a subset of routers (e.g. PW-specific,
> or VPLS-specific info) is sent to all PEs participating in the BGP
> domain. More than that, P routers, usually used as route reflectors
> in IP routing, end up storing all information while they are not
> using any of it locally.

RRs in VPN deployments are typically not in the topology. My
understanding of the P-router term is that it is a transit node that
does not have VPN information.

>    Note also, that best path selection that is normally performed by
> BGP when it receives information about the same prefix from multiple
> peers, is not needed in the VPLS case, and (even if implementations
> decided to apply the same algo as in regular BGP) would just be an
> artifact.

Not really... i can advertise the same key from multiple sources in
L2VPNs also. All policy mechanisms do work... igp distance, etc. It is
just the semantics once the path is selected that are different.

As an example, think of a working and a protect PE for a given
emulated circuit (or lan).
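
A minimal sketch of the working/protect case, assuming a heavily
simplified selection rule (highest local preference, then lowest IGP
metric; the real decision process has more steps). The Path type and
the PE names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Path:
    next_hop: str      # advertising PE
    local_pref: int    # higher is preferred
    igp_metric: int    # lower is preferred


def select(paths: List[Path]) -> Optional[Path]:
    """Simplified selection: highest local_pref, then lowest IGP metric."""
    if not paths:
        return None
    return min(paths, key=lambda p: (-p.local_pref, p.igp_metric))


# Two PEs advertise the same key for one emulated circuit.
working = Path("pe-working", local_pref=200, igp_metric=30)
protect = Path("pe-protect", local_pref=100, igp_metric=10)

active = select([working, protect])
# If the working PE withdraws its advertisement, protect takes over.
fallback = select([protect])
```

The selection code is the same as for IP; what differs is what the
receiver does with the winning path.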

>    The above exposes the difference between the routing nature of
> BGP when used for IP (where reachability info is distributed and the
> path properties are as important as the info itself), and its purely
> transport application in the proposal (where only the fact of
> information delivery is important.)

>    Interestingly enough, from the transport perspective, BGP, though
> reduces the number of sessions a given PE has to maintain (and thus
> the sender's complexity), introduces additional overhead from the
> receiver's perspective--if a PE router has multiple BGP sessions
> (which is normally the case), it will receive the same information
> more than once, while clearly a single copy is enough.

I don't know which model you have in mind but in a typical VPN
deployment scenario (l3 or l2/vpls/etc) a PE has 2 peering sessions to
a RR outside of the topology. The second copy of the information is
there for redundancy...

If a full mesh were used, only 1 copy would be present.

>  3. Aggregation of information for large-scale operation

>    When distributing information among a large number of systems, it
> is important to be able to aggregate information as it travels
> further ahead to ensure scalability of the system. In routing this
> is achieved by summarizing a set of prefixes and announcing them as
> a less specific prefix. For example, AS'es in the Internet do not
> exchange granular IP prefixes visible inside IGPs, but instead send
> each other aggregate prefixes via BGP.

>    It is not clear to me how, given the format of the NLRI field,
> VPLS information can be aggregated using the proposal in the
> document.

To give you an example, in JunOS aggregation is implemented as a
separate routing protocol... if i'm not mistaken the model is lifted
from 'gated'. Clearly the idea that aggregation may be a distinct
component from BGP has been around for a while.

VPLS doesn't really need aggregation although it does use an IGP :-)
PE to PE connectivity is established independently from the
'forwarding distinguisher' advertisement (i.e. the inner label). One
or several routing and signaling protocols may be used for this
functionality. Only the information exterior to the SP network
(service attachments) is carried through BGP.

>  The above gives me a very uncomfortable feeling that the proposal
> is stretching BGP to perform functions it was not designed for.

Any successful protocol will be used for purposes other than those it
was designed for. That is usually a sign that the designers got
something right.

Let me give you an example: BGP is currently used to block
spam-propagating networks/hosts. Was this an original goal of the
design? Hardly. When used to block spam, BGP does not advertise any
valid forwarding information, for instance. And i'm sure it is a
question of time until we add port information to the record keys.

>  Below are some additional points that should be taken into
> consideration.
   
>  4. Backwards compatibility and SW upgrade requirements

>    Because the proposal suggests using a new AFI/SAFI combination,
> PE routers will not be able to announce VPLS information using the
> existing BGP infrastructure. All BGP speakers in a SP's network,
> including the P routers, will have to be upgraded with new SW,
> though information needs to be exchanged only among the PEs.

That is not an issue as we've seen above. The deployment model is
different from what you assume.

>  5. Coupling of VPLS and BGP SW

>    Putting VPLS-related functions in BGP leads to two unwanted
> consequences:

>     a) Lesser BGP code stability--bugs in the VPLS part of the code
> will likely affect parts of BGP used for Internet routing, thus
> increasing the chances of BGP failures in SP networks.  The same
> argument works in the opposite direction.

You have no basis to conclude that.

Any modern BGP implementation worth its salt consists of
AF-independent code + AF-specific code. The fact is that you can
implement VPLS without touching the AF-independent code.
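
A sketch of that split, assuming a simple handler registry. All names
are hypothetical, and the AFI/SAFI pair used for the second family is a
placeholder chosen for illustration only.

```python
from typing import Callable, Dict, Tuple

AfiSafi = Tuple[int, int]
Handler = Callable[[bytes], str]

# AF-specific code registers itself here; the core below never changes.
_handlers: Dict[AfiSafi, Handler] = {}


def register(afi: int, safi: int, handler: Handler) -> None:
    _handlers[(afi, safi)] = handler


def process_update(afi: int, safi: int, nlri: bytes) -> str:
    """AF-independent path: look up the handler and delegate."""
    handler = _handlers.get((afi, safi))
    if handler is None:
        return "ignored"  # no handler registered for this family
    return handler(nlri)


def ipv4_unicast(nlri: bytes) -> str:
    return f"install ip route ({len(nlri)} byte key)"


def l2vpn_hypothetical(nlri: bytes) -> str:
    return f"set up pseudowire ({len(nlri)} byte key)"


register(1, 1, ipv4_unicast)
# Placeholder AFI/SAFI values for the new family.
register(25, 65, l2vpn_hypothetical)
```

Adding the second family is one new function and one register() call;
process_update() is untouched.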

Any line of code that you change in an implementation has the
potential to introduce bugs...

>     b) Potential dynamic effects--since with a BGP-based approach,
> VPLS- and routing-related processes are likely to share the same
> internal router resources (such as timers, threads, locks/mutex'es,
> queues, memory pools), dynamics of the VPLS system are likely to
> influence the dynamics of the routing- related functions and vice
> versa. The larger the overlap between the two systems, the higher
> are the chances of such interference.

I'm sorry but this is just FUD.
All router implementations do have some level of resource sharing
between completely unrelated features. In some of them, all
functionality shares all resources.

Don't want BGP sharing timers w/ X.25-over-TCP... disable one of them.

>  My recommendation would be for the WG to consider these points.

The way i see it there is a high likelihood of this turning into a
"Yes, it is" "No, it isn't" discussion. And I'd really like to avoid
that.

A question to you and to the WG(s) in general:

- What are the main concerns that you have w/ the generic database
exchange view of BGP (let's call it the "Basically General Purpose"
theory)?

- Can we have a reasonable discussion about the best engineering
approach to provide database exchange services for
routing-related-applications without getting into a religious argument
about "2547 is evil"? i.e. can we try to separate how highly each one
of us rates the actual application from this discussion?

- I believe one of the preconditions for a reasonable discussion is to
realise that implementors are the people most interested in not
introducing regressions to shipping code. They actually get to fix it
after being screamed at for a considerable length of time.
I'd really like to get past the "you can't implement a feature i don't
want because you are going to break the code" kind of discussion.

- Are we going to have a similar discussion about LDP? LDP is not
any less relevant for network stability, nor is it a protocol which is
any simpler than BGP (if anything the level of complexity is higher
given that LDP has all the db exchange problems of BGP + a non trivial
amount of issues of its own).

regards,
  Pedro.