Re: [GROW] I-D Action:draft-ietf-grow-diverse-bgp-path-dist-01.txt

Hi Robert, all

I share some of Jim's comments. To me the draft tries to cover too many
cases. What about an (additional?) "simple"-diverse-bgp-path draft which
states that:
- add-path is the long-term solution
- in the short term, while not all routers are add-path compliant, a
(possibly add-path compliant) router could advertise multiple paths to
its peer by using multiple parallel standard BGP sessions.
- add-path applications (specifications & related tradeoffs) can apply
as-is to diverse-simple.
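
To make the parallel-session idea concrete, here is a minimal Python
sketch. It is only an illustration under stated assumptions: the `Path`
fields, the deliberately simplified ranking in `rank` (local pref, then
AS path length, then next hop as a string tie-breaker) and the
`advertise` helper are hypothetical names of mine, not taken from the
draft or from any implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    # Hypothetical, reduced set of attributes for illustration only.
    next_hop: str
    local_pref: int
    as_path_len: int


def rank(paths):
    """Order paths best-first using a simplified comparison (assumption):
    higher local pref, then shorter AS path, then lowest next hop string."""
    return sorted(paths, key=lambda p: (-p.local_pref, p.as_path_len, p.next_hop))


def advertise(paths, n_sessions):
    """Session k carries the (k+1)-th best path, or None if there is none."""
    ordered = rank(paths)
    return {k: (ordered[k] if k < len(ordered) else None) for k in range(n_sessions)}


paths = [
    Path("10.0.0.2", 100, 3),
    Path("10.0.0.1", 200, 4),
    Path("10.0.0.3", 100, 2),
]
per_session = advertise(paths, 2)
```

Here session 0 would carry the path via 10.0.0.1 and session 1 the path
via 10.0.0.3; a peer that is not add-path compliant sees each of them as
an ordinary single-path session.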

That deployment model is already covered in the draft. But IMHO the
addition of multiple (more complex) ones adds unnecessary complexity
(given that a typical reader may be mainly interested in a single one).

Besides, that simplified framework would ensure that
"simple"-diverse-path applications (spec and issues) would be aligned
with add-path ones without additional work. This also removes the need
for independent RRs in different planes to make coherent routing
decisions, which may be a source of discussion and complexity.

There are still points to be studied, such as the correct handling of
parallel BGP sessions, e.g. what if session 0 (which advertises the best
path) goes down? To be safe, I would call for the shutdown (possibly
graceful shutdown :-) ) of the other parallel sessions.
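
As a rough sketch of that conservative rule (in Python, and purely
illustrative): the function name `on_session_down`, the "up"/"down"
states and the convention that session 0 carries the best path are
assumptions of this example. Graceful shutdown is not modeled.

```python
def on_session_down(states, failed):
    """Return the new session states after session `failed` goes down.

    `states` maps a session index to "up" or "down". Session 0 is assumed
    to carry the best path (an assumption of this sketch).
    """
    new_states = dict(states)
    new_states[failed] = "down"
    if failed == 0:
        # Conservative rule: once the best-path session is lost, also tear
        # down the sessions carrying the 2nd..Nth paths.
        for k in new_states:
            new_states[k] = "down"
    return new_states


after_primary_loss = on_session_down({0: "up", 1: "up", 2: "up"}, 0)
after_backup_loss = on_session_down({0: "up", 1: "up", 2: "up"}, 1)
```

Losing a non-primary session only removes that one additional path,
while losing session 0 takes all parallel sessions down with it.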

Thanks,
Best regards,
Bruno

> From: Robert Raszuk
> 
> Hi Jim,
> 
> > The approach as I understand it is to deploy multiple channels to
> > disseminate routing state, be it the 2,3,...nth path to some dest D..
> 
> Not necessarily at all. Multiple channels is just one of the possible
> deployment models.
> 
> The primary goal is to observe that, given today's typical pair of
> best-path RRs, one could easily turn one of the reflectors in such a
> pair into a backup RR and, without provisioning any new IBGP session
> or adding new RRs, disseminate a backup path to all clients.
> 
> This provides a very easy-to-deploy mechanism where, without any need
> for a PE upgrade, you provide PEs with additional paths for fast
> connectivity restoration, PIC or load balancing needs.
> 
> > Comments follow...
> 
> So do further replies ...
> 
> > Jim Uttaro
> >
> > Section 1.0
> >
> > "The parallel route reflector planes solution brings very
> > significant benefits at a negligible capex and opex deployment price
> > as compared to the alternative techniques"
> >
> > A number of points need to be clarified here. The first is the
> > SP/Operator needs to deploy n number of RR planes to disseminate N
> > paths. Assuming some form of redundancy we would have to of course
> > buy the RRs or deploy some type of logical routers. How can this be
> > monetized?
> 
> Not required. You maintain connectivity redundancy even when turning
> an existing pair of RRs into a primary and a backup. No new purchase
> order is required, nor any need for logical routers.
> 
> The draft describes RR planes to give a formalized description of the
> proposal, but I was hoping that it could be easily translated into
> basic deployment styles.
> 
> Adding a new set of RRs, if someone needs to, is also possible, but
> not at all necessary for the diverse path deployment.
> 
> > Does this approach assume that customers who want fast
> > restoration, load balancing, mitigation of oscillation would pay for
> > this? Or does the draft assume that the addl RRs are of such
> > negligible capex cost that the operator would simply incur the cost?
> > This model does not usually sit well with the folks that write the
> > checks.
> 
> Again .. this is not necessary at all. On the contrary, if you need to
> swap all your PEs for new ones which support alternative ways to
> disseminate more than one BGP path .. this is where the check would be
> rather heavy :).
> 
> Comparing that with just a code upgrade on the RRs, it seems clear
> where the savings are.
> 
> > From an
> > opex perspective we are putting in addl planes for each AS that is
> > under the operator's authority. So we not only need to pay for it we
> > would need to establish coherent inter-AS strategies to manage,
> > maintain these addl RRs.. Additionally the function of these devices
> > is different than a traditional RR which implies that OpS needs to be
> > cognizant of the difference and how they should be managed.. As
> > described in the draft the 2,3..nth plane need not be as robust as it
> > is not the primary path.. This needs to be understood by OpS in terms
> > of their response to failure or how to perform maintenance.. We are
> > essentially introducing a new device from these perspectives...
> 
> Again you are stuck with one particular deployment model. I take it as
> a recommendation to clarify in the next version of the draft that a
> provider may simply, without adding new RRs or touching PEs, turn one
> of the existing RRs into a diverse-path RR and disseminate diverse
> BGP paths to the clients.
> 
> All that is required on the provider side is to upgrade the RRs with
> new code + enable client sessions with a new knob.
> 
> > Section 2.1
> >
> > "This new requirement has its own memory and processing cost.
> > Suffice to say that by the middle of 2009 none of the commercial BGP
> > implementation can claim to support the new add-path behaviour in
> > production code, in part because of this resource overhead."
> >
> > A bit confused by this statement.. My thoughts on this was add-paths
> > is useful for a customer that is advertising multiple paths or at
> > peering points. In both cases we would anticipate the use of routing
> > policy to only select a subset of these routes. It is impractical to
> > believe that we are going to duplicate the same state over and over
> > again on each plane.. This is not a function of the draft but how
> > operators deploy the functionality.. This functionality has been
> > around a long time in VPNV4 services and I believe it will eventually
> > be used for IPV4 services..
> 
> At peering points I find it very common to set next hop self while
> advertising towards IBGP peers so there is practically no reason to
> advertise multiple ebgp learned paths towards the core.
> 
> VPNv4 PEs require this day one .. many ISPs in their IPv4/IPv6 do it
> by default as well today.
> 
> One would also need to note that it is quite common practice to
> provision external peerings on different ASBRs so that a single ASBR
> going down does not disconnect a number of external peering sessions.
> 
> 
> > " The add paths protocol extensions have to be implemented by all
> > the routers within an AS in order for the system to work correctly."
> >
> > Pls explain.. Why do you believe this? It is certainly not practical
> > and I never envisioned a full upgrade across thousands of edges in
> > multiple AS domains.. The approach we believe we could take is to
> > deploy on a subset of edges for some set of routes.
> 
> The question one needs to ask is what is the overall goal? As you have
> observed, the primary goal SPs are after is to provide fast
> connectivity restoration, load balancing or to mitigate oscillations.
> 
> To accomplish this in MPLS networks or in IP encapsulation networks
> you need to push additional state to all edges/PEs otherwise you are
> missing the alternative paths where they need to be present.
> 
> Pushing more than the best path with add-paths requires an upgrade of
> PEs. Distributing additional paths with the diverse-path proposal does
> not require any touch to the PE.
> 
> So can you clarify what is the point of using add-paths for a "subset
> of edges for some set of routes"?
> 
> 
> > " It is intended as a way to buy more time allowing for a smoother
> > and gradual migration where router upgrades will be required for
> > perhaps different reasons.  It will also allow the time required
> > where standard RP/RE memory size can easily accommodate the
> > associated overhead with other techniques without any compromises."
> >
> > This statement seems to conflict with the one above.. Above you state
> > that it is needed everywhere to work correctly, here the statement is
> > that we can buy time to gradually migrate.. Why don't we just
> > gradually migrate and eliminate this middle step??
> 
> The gradual migration may take 10 years. Do you want to offer inferior
> services as compared with other SPs over the next 10 years ?
> 
> In my honest opinion add-paths and diverse-path are complementary to
> each other. Yes, they both enable an operator to distribute more than
> the overall best path, but they do it differently.
> 
> I imagine that RRs could be capable of supporting peers which are
> upgraded with the add-paths code as well as those clients which are
> not yet upgraded, but would still benefit from receiving more than the
> best path only.
> 
> In fact I think that most of the current applications can be easily
> satisfied with just the 2nd best path, whose dissemination the diverse
> path proposal addresses.
> 
> 
> > Section 4
> >
> > " The proposed solution is based on the use of additional route
> > reflectors or new functionality enabled on the existing route
> > reflectors that instead of distributing the best path for each route
> > will distribute an alternative path other then best. "
> >
> > Would like to drill down on this a bit.. In the first case where addl
> > deployment of RRs are done I am assuming that these RRs would somehow
> > prefer the second best path of the first.. How would this be done?
> > Customers use many different mechanisms to identify primary,
> > secondary, etc... AS-PATH prepend, Local Pref, IGP cost etc... are
> > all used.. How is this done on the secondary plane? Regardless of
> > either of these approaches changes to the BGP implementation to
> > select a different POI is needed. But how do you know how a customer
> > is identifying? Pls expand on this. It would seem that although the
> > protocol definition does not change the operator needs to ensure that
> > this functionality is constructed the same way across all the
> > vendors.. Will this require another draft?
> 
> BGP best path algorithm is quite consistent today across vendors. That
> means that calculating best path is consistent. That also means that
> calculating 2nd best path would also be consistent.
> 
> So when could this break?
> 
> It could break in step 9 of best path selection on non co-located RRs,
> where we consider the IGP metric to a BGP next hop.
> 
> In order to address this point there are a number of options:
> 
> * Make sure that from the IGP point of view the primary RR and the
> backup RR are at the same point in the network - nothing additional is
> needed - that is also very often the case for control plane RRs in
> tunneled networks
> 
> * Disable the IGP metric check step on RRs - as a matter of fact, RRs
> making a best path decision from their own point of view only makes
> sense when the RRs are in the data plane on the POP to core
> boundaries. In all other RR placements somewhere in the core it is
> really not necessary.
> 
> * No need to worry about any IGP metric step, but allow the backup RR
> to learn the primary RR's best path and take this knowledge into
> account when advertising a diverse path towards clients. Again there
> is no need to add any new RR nor to modify even a single line of
> configuration of the clients.
> 
> Any other BGP mechanism like AS-PATH prepend, Local Pref etc ... would
> be treated identically on both primary and backup RR so no issue.
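
A minimal Python sketch of the third option, as I read it. Everything
here is an assumption made for illustration: the tuple layout
(next hop, local pref, AS path length), the reduced preference order
with no IGP metric step, and the reading of "diverse" as "a next hop
different from the one of the primary RR's best path". The name
`diverse_path` is hypothetical and tie-breaking is not modeled.

```python
def diverse_path(candidates, primary_best_next_hop):
    """Pick the path a backup RR would advertise, or None if none is left.

    `candidates` is a list of (next_hop, local_pref, as_path_len) tuples.
    The backup RR is assumed to have learned the next hop of the primary
    RR's best path and excludes every path sharing that next hop.
    """
    def preference(path):
        _next_hop, local_pref, as_path_len = path
        # Simplified order: higher local pref, then shorter AS path.
        # The IGP metric step is deliberately skipped.
        return (-local_pref, as_path_len)

    eligible = [p for p in candidates if p[0] != primary_best_next_hop]
    return min(eligible, key=preference) if eligible else None


candidates = [
    ("10.0.0.1", 200, 4),
    ("10.0.0.2", 100, 3),
    ("10.0.0.3", 100, 2),
]
backup = diverse_path(candidates, "10.0.0.1")
```

With these inputs the backup RR would advertise the path via 10.0.0.3,
and it does so without comparing any IGP metric, so its own position in
the network does not influence the result.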
> 
> 
> > " The best path (main) reflector plane distributes the best path for
> > each route as it does today.  The second plane distributes the second
> > best path for each route and so on.  Distribution of N paths for each
> > route can be achieved by using N reflector planes."
> >
> > How is this done when it is the IGP cost that is the deciding
> > factor.. Will we have to correctly place the Nth plane corresponding
> > to IGP correctly in the IGP??
> 
> See above.
> 
> 
> > " It is easy to observe that the installation of one or more
> > additional route reflector control planes is much cheaper and an
> > easier than the need of upgrading 100s of routers in the entire
> > network to support different protocol encoding."
> >
> > See Above I do not believe it is all or nothing..
> 
> Also see above :) And by installation please do not think of physical
> RR installation. By this I also meant turning an existing set of RRs
> into a backup RR plane.
> 
> 
> > " Diverse path route reflectors need the new ability to calculate and
> > propagate the Nth best path instead of the overall best path.  An
> > implementation is encouraged to enable this new functionality on a
> > per neighbor basis."
> >
> > Encouraged? I think it would be required..
> 
> I agree it is preferred and I am supporting that.
> 
> But one could observe that, especially in topologies where you have
> very good POP symmetry towards pairs of RRs or when you would prefer
> to add an RR as backup, you may want to turn on diverse-path
> functionality on such a backup RR on a per SAFI basis.
> 
> > Section 4.1.  Co-located best and backup path RRs
> >
> > "To simplify the description let's assume that we only use two route
> > reflector planes (N=2).  When co-located the additional 2nd best path
> > reflectors are connected to the network at the same points from the
> > perspective of the IGP as the existing best path RRs."
> >
> > Based upon implementation this may require ports on existing core
> > router to terminate and a costing paradigm that duplicates the
> > original. The latter may be simple; the former would require that
> > there is availability at these locations.. Doesn't this also imply
> > full symmetry? We could not deploy a subset for the nth plane and
> > mimic the IGP decision making of the first?? The draft states that
> > full symmetry is not needed.. Pls Clarify..
> 
> As indicated above, full symmetry only applies when you want to make
> sure that the RRs' point in the IGP is the same for the primary and
> the backup RR. As described earlier, this is just one of 3 ways to
> make sure the backup RR calculates correct backup paths towards its
> clients.
> 
> And also as described above, in this example the addition of a second
> plane may be as simple as upgrading one of your existing RRs and
> enabling it to distribute a diverse path towards the clients.
> 
> Initially to one or a few clients on a per session basis .. while
> still sending other clients a duplicate of the best path like today -
> later, with more experience gained, to more and more clients being
> served by this cluster.
> 
> > " One of the deployment model of this scenario can be achieved by
> > simple upgrade of the existing route reflectors without the need to
> > deploy any new logical or physical platforms.  Such upgrade would
> > allow route reflectors to service both upgraded to add-paths peers as
> > well as those peers which can not be immediately upgraded while in
> > the same time allowing to distribute more than single best path."
> >
> > The implication here is that the same primary RR would have to "hold"
> > and disseminate multiple paths to D.. Would this create a scalability
> > problem on this RR as it would have to hold these addl routes. Even
> > though the number of BGP routes for the internet is small in
> > comparison to VPNV4 this should be accounted for when RR platforms
> > are selected.
> 
> I think for RR platforms scalability concerns about the number of
> routes and number of sessions are no longer the issue. Talk to your
> favorite vendor for up to date RR scalability numbers :) But you are
> very correct. Those need to be considered when RR platforms are
> selected.
> 
> > Section 4.2.  Randomly located best and backup path RRs
> >
> > " The basic premise of this mode of deployment assumes that all
> > reflector planes have the same information to choose from which
> > includes the same set of BGP paths.  It also requires the ability to
> > skip the comparison of the IGP metric to reach the bgp next hop
> > during best-path calculation."
> >
> > Scalability concerns. We would be putting our main primary RRs at
> > risk.
> 
> Not sure what risk you are referring to. As indicated earlier, for
> control plane RRs it has not really been a necessary step in best path
> selection since day one.
> 
> > Again I am confused about the IGP metric.. If the paths are equal up
> > to the IGP metric how do we decide which is primary/secondary.. The
> > secondary RR needs to select one of the paths how does it do that??
> > Is it router-id or something of that nature..
> 
> See above.
> 
> > "4.  Fully meshing newly added RRs' with the all other reflectors in
> > both planes.  That condition does not apply if the newly added
> > RR'(s) already have peering to all ASBRs/PEs."
> >
> > I cannot see creating BGP sessions to all ASBR/PEs. There are BGP
> > session limits that also must be accounted for so I do not see that
> > as a viable alternative in a large network.. So I guess we would have
> > to fully mesh to all RRs. This is similar to a full mesh of PEs in
> > terms of getting all the routes on the secondary to make a decision.
> 
> This is the normal process of introducing a new RR into the network.
> But as already said a few times, this is optional.
> 
> > " Any of the existing routers that are not already members of the
> > best path route reflector plane can be easily configured to serve the
> > 2nd plane either via using a logical / virtual router partition or by
> > local implementation hooks."
> >
> > The term "Easily" is used too liberally. Getting complex
> > functionality configured on our most important parts of the network
> > is never easy. It requires a lot of test certification and
> > coordination between OpS, maintenance, etc... to get deployed
> 
> One needs to pick the right set of tools with which one can accomplish
> the task in the easiest way. My goal is to deliver various deployment
> options and assist in the selection of the best set of tools to
> complete the job.
> 
> That's why I am not saying to do this one way .. Depending on network
> size, scale and complexity, one may find adding a new RR a trivial
> exercise; on the other hand someone else may think of an existing RR
> upgrade at the next upgrade window as a pretty much free operation
> which needs to be performed anyway. Then enabling diverse path to some
> clients and seeing how it works seems like a very smooth and gradual
> deployment - much easier than any RR based alternatives I can think of
> today.
> 
> > " The additional planes of route reflectors do not need to be fully
> > redundant as the primary one does.  If we are preparing for a single
> > network failure event, a failure of a non backed up N-th best-path
> > route reflector would not result in an connectivity outage of the
> > actual data plane.  The reason is that this would at most affect the
> > presence of a backup path (not an active one) on same parts of the
> > network.  If the operator chooses to build the N-th best path plane
> > redundantly by installing not one, but two or more route reflectors
> > serving each additional plane the additional robustness will be
> > achieved."
> >
> > Yes that may be true but we envision add-paths as being
> > functionality that not only enables fast restoration but the ability
> > to provide customer with load balancing. Probably good to be specific
> > about the goals of the draft in the intro/abstract.
> 
> So do I. Diverse path accommodates both goals just fine.
> 
> > Probably good to be specific about the
> > goals of the draft in the intro/abstract.
> 
> Ack.
> 
> 
> > Section 4.3.  Multi plane route servers for Internet Exchanges
> >
> > " In such cases 100s of ISPs are interconnected on a common LAN.
> > Instead of having 100s of direct EBGP sessions on each exchange
> > client, a single peering is created to the transparent route server.
> > The route server can only propagate a single best path.  Mandating
> > the upgrade for 100s of different service providers in order to
> > implement add-path may be much more difficult as compared to asking
> > them for provisioning one new EBGP session to an Nth best-path route
> > server plane."
> >
> > I do not understand. Are you saying that each eBGP session is nailed
> > up to each plane? Are you implying that we deploy 100 planes? If not
> > how do we know which one of the 100 ISPs should get the benefit of
> > having routes sourced by them propagated through the network??
> 
> No :) I am saying that if you have 100s of IX clients, by default the
> route server would send only one overall best path. So if you have two
> RSes (and this is common for redundancy) one may send the overall best
> and the other one a diverse path to the IX customers. A 3rd best path
> would also be easy to achieve.
> 
> I will clarify this section.
> 
> 
> Jim - Many thx for your excellent comments and review,
> R.
> 