Re: [Idr] Regd. https://datatracker.ietf.org/doc/draft-mohanty-idr-secondary-label/

Robert Raszuk <robert@raszuk.net> Sun, 13 August 2023 15:55 UTC

From: Robert Raszuk <robert@raszuk.net>
Date: Sun, 13 Aug 2023 17:55:16 +0200
Message-ID: <CAOj+MMHGTCi4puRd5b0QDd3n7-igGyiN0VSkU0PbwU9sOCcmqw@mail.gmail.com>
To: "Satya Mohanty (satyamoh)" <satyamoh@cisco.com>
Cc: Igor Malyushkin <gmalyushkin@gmail.com>, "idr@ietf.org" <idr@ietf.org>, "RAMADENU, PRAVEEN" <pr9637@att.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/188XgDLTlONpsg4R1i3pn9KoAVA>
Subject: Re: [Idr] Regd. https://datatracker.ietf.org/doc/draft-mohanty-idr-secondary-label/

Hi Satya,

Irrespective, the RR client/non-client discussion or an option-B IAS (in
> which case none of RFC4456, RR clients/non-clients/ cluster-id etc.
>

I respectfully disagree. A path learned over IBGP should not be re-advertised by a BGP speaker
to other IBGP peers unless we are doing route reflection.

That is why BGP has the requirement of fully meshing the IBGP peers unless
confeds or reflection is used.

So stating that CLUSTER_ID does not apply is not proper in the IDR WG. Maybe
you can say this in other forums though ...
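The propagation and loop-prevention rules in question can be sketched roughly as below. This is only an illustration of the RFC 4271 / RFC 4456 behavior being discussed; the route and peer shapes are assumptions, not any implementation's data structures:

```python
# Rough sketch of the IBGP propagation rules under discussion
# (RFC 4271 full-mesh rule, RFC 4456 reflection and loop checks).
# The dict shapes here are illustrative only.

def should_reflect(from_client, to_client):
    """RFC 4456: a client route is reflected to everyone; a
    non-client route is reflected only to clients."""
    return from_client or to_client

def accept_update(route, my_cluster_id):
    """RFC 4456 loop prevention: drop the update if our own
    CLUSTER_ID already appears in the CLUSTER_LIST."""
    return my_cluster_id not in route.get("cluster_list", [])

def propagate_ibgp(route, speaker_is_rr, from_client=False, to_client=False):
    """RFC 4271: an IBGP-learned route is NOT sent to other IBGP
    peers unless the speaker is a route reflector (or
    confederations are in use, not modeled here)."""
    if route["learned_via"] != "ibgp":
        return True            # EBGP-learned routes go to all peers
    if not speaker_is_rr:
        return False           # plain IBGP speaker: full mesh assumed
    return should_reflect(from_client, to_client)
```

This is why, within a cluster, the CLUSTER_ID check is exactly what stops the routes reflected between the two RRs from being accepted back.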


> apply) should not detract from the main topic.  BTW, a topology like Fig.1
> (which is greatly simplified) has been in production for more than 2 years now
> without any RR-related issues.
>

The duration for which you have been running code that violates a protocol
spec does not matter.

Regarding PD#2, I will try to explain the issue with respect to a
> particular VPN prefix with regards to Figure 2 in the draft. Let’s say we
> are doing vanilla PIC.
>


> 1.  Local label at PE1 has primary path with next-hop ISP1 and backup
> PE2. Say this label is 100. At PE1, we cannot have the backup to ISP2
> because of the given objective constraint that traffic should be able to
> still reach ISP1 so long as there is a path from one of the PEs to ISP1. If
> we choose the backup as ISP2, and PE2-ISP1 was intact, then we would have
> defeated our purpose if we forwarded to ISP2 directly since the forwarding
> path PE1—PE2—ISP1 exists.
>

Ok.


> 2.  Local label at PE2 has primary path with next-hop ISP1 and backup PE1.
> Say this label is 100. We cannot have the backup to ISP2 because of the
> same constraint that I mentioned in (1) above.
>

Ok.


> Say traffic from PE0 is ingressing at PE1 with label 100. If the PE1-ISP1 link
> breaks, with vanilla PIC, traffic will be diverted to PE2 with label
> swapped to 200. At PE2, if it then finds that PE2-ISP1 is broken, it will
> send it back to PE1 after swapping label to 100, and then the micro-loop
> ensues until the BGP Convergence.
>

This is a classic case of trying to apply PIC twice to the same
packet. So in your proposal you are just trying to signal a second label with
the semantics of "DO NOT PROTECT WITH PIC".
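The ping-pong described above can be illustrated with a toy label FIB. Labels 100 and 200 and the topology come from the discussion; everything else (entry format, TTL limit) is a sketch, not any router's behavior:

```python
# Toy forwarding sketch of the PD#2 micro-loop: each PE's label entry
# has a primary next hop (ISP1) and a PIC backup (the other PE).
# When both PE-ISP1 links are down, the packet ping-pongs PE1<->PE2.

FIB = {
    ("PE1", 100): {"primary": ("ISP1", None), "backup": ("PE2", 200)},
    ("PE2", 200): {"primary": ("ISP1", None), "backup": ("PE1", 100)},
}

def forward(node, label, links_up, ttl=8):
    """Follow the label FIB until the packet exits or TTL expires."""
    path = [node]
    while ttl > 0:
        entry = FIB[(node, label)]
        nh, out_label = entry["primary"]
        if (node, nh) not in links_up:
            nh, out_label = entry["backup"]   # PIC repair
        path.append(nh)
        if out_label is None:                 # reached an ISP, done
            return path, False
        node, label, ttl = nh, out_label, ttl - 1
    return path, True                         # TTL expired: micro-loop

# Both PE-ISP1 links down: traffic loops PE1 -> PE2 -> PE1 -> ...
# With only PE1-ISP1 down, PE1's backup to PE2 delivers to ISP1.
```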

I still do not agree that we should complicate the protocol to save a few tens of ms,
instead of sending the traffic from PE1 to ISP2 for the duration of BGP convergence.

And what happens to your scheme when you have not 2 but 100 ISPs connected
to PE1 and PE2 respectively?

You say, with respect to label 400:

   9.  When this traffic is received at PE2, if the PE2-ISP1 link is up,
       traffic will be forwarded to ISP1 on that link.  But, now if the
       PE2-ISP1 goes down, the backup path for the label 400 which
       points to the NH ISP2, is activated immediately and the traffic
       is directed to ISP2 on the PE2-ISP2 link.

How do you make the decision, at label allocation time, that the backup for
label 400 is the eBGP peer, when according to the BGP rules the more preferred
path is via PE1, and PE1 and the link to PE1 are up the whole time?
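For concreteness, step 9 as quoted would correspond to a forwarding entry roughly like the one below. This is a sketch of the semantics only; the entry format and names are assumptions, not the draft's encoding:

```python
# Sketch of the secondary-label semantics as quoted from the draft:
# label 400 at PE2 carries "already protected once" meaning, so its
# backup points at the local EBGP exit (ISP2), never back to PE1.

SECONDARY_FIB = {
    ("PE2", 400): {"primary": ("ISP1", None), "backup": ("ISP2", None)},
}

def forward_secondary(node, label, links_up):
    """One lookup: primary if its link is up, else the EBGP backup."""
    entry = SECONDARY_FIB[(node, label)]
    nh, _ = entry["primary"]
    if (node, nh) not in links_up:
        nh, _ = entry["backup"]   # repair toward the EBGP exit only
    return nh

# PE2-ISP1 up -> ISP1; PE2-ISP1 down -> ISP2, no bounce back to PE1.
```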

And what if you have not two but four PEs, each connected to 100 ISPs?

Kind regards,
Robert


I think it may be easier to describe it in better detail in the next version,
> so that similar questions do not crop up.
>
>
>
> Best Regards,
>
> --Satya
>
>
>
>
>
>
>
> *From: *Igor Malyushkin <gmalyushkin@gmail.com>
> *Date: *Saturday, August 12, 2023 at 6:10 AM
> *To: *Robert Raszuk <robert@raszuk.net>
> *Cc: *Satya Mohanty (satyamoh) <satyamoh@cisco.com>, idr@ietf.org <
> idr@ietf.org>, RAMADENU, PRAVEEN <pr9637@att.com>
> *Subject: *Re: [Idr] Regd.
> https://datatracker.ietf.org/doc/draft-mohanty-idr-secondary-label/
>
> Hi Robert,
>
> Well, maybe RFC4456 indeed requires some clarification. From
> my experience, inline RRs are not the same as regular ones. Yes, they use
> the same mechanics but solve other tasks, and because they are LSRs for BGP
> LSPs or VPN LSPs some tricks with CLUSTER_IDs, peering, and label
> allocation modes are required here.
>
> I agree that the solution from PD#1 is a bad idea to solve the scaling
> issue. I don't think that there should be a new solution with the next
> layer of labels and a new path attribute whenever BGP LU is here for ages
> and solves this problem better.
>
>
>
> With Option B I would like to see which of the approaches is better (this
> one or B/C).
>
>
>
> сб, 12 авг. 2023 г. в 16:54, Robert Raszuk <robert@raszuk.net>:
>
> Hi Igor,
>
>
>
> > Using different CLUSTER_IDs for inline RRs at the same hierarchy level
> is common
>
>
>
> Even if you do set up different CLUSTER_IDs it should be fine ... as the
> other RR should not accept an UPDATE message when it sees its own CLUSTER_ID
> in the incoming update.
>
>
>
> Remember, the CLUSTER_ID should get prepended upon reflection, not overwritten.
>
>
>
> Label allocation has nothing to do with the loop. It is the broken reflection
> configuration which causes the described loops.
>
>
>
> Yes, between clusters you can set up non-client IBGP sessions to fully mesh the
> clusters, but within a cluster it is rather a poor idea to make RRs clients of
> each other.
>
>
>
> So PD#1 is simply a misconfiguration IMHO.
>
>
>
> If you think otherwise please update RFC4456 first. Only then could we
> consider solutions to the problem caused by such an update.
>
>
>
> Regards,
>
> Robert
>
>
>
>
>
> On Sat, Aug 12, 2023 at 2:39 PM Igor Malyushkin <gmalyushkin@gmail.com>
> wrote:
>
> Hello, Robert, Satya,
>
> Using different CLUSTER_IDs for inline RRs at the same hierarchy level is
> common. Especially when there is a labeled unicast underneath. Although, I
> don't understand why two RRs should be clients to each other instead of
> regular non-client peers.
>
>
> For PD#1, it is possible to signal LU addresses of PE1, PE2, and both RRs
> and use them as NHs for VPN prefixes. In this case for labeled unicast
> prefixes a per-prefix label allocation mode completely solves the problem.
> For VPN sessions RRs do not apply next-hop-self but act as classical RRs
> (or even can be unaware of any VPN sessions at all). Classical seamless
> MPLS approach. With the different CLUSTER_IDs, PIC between the RRs can be
> maintained also.
>
>
> If we talk about Option B, the solution with LU does not obviously work,
> but there are several approaches to cope with scaling problems, Option A/B,
> and Option B/C (draft-zzhang-bess-vpn-option-bc-00). The latter is a new
> draft that combines the two-label approach but does not require new path
> attributes.
>
> For PD#2, here I agree with Robert that it is strange to use internal BGP
> paths instead of external ones for PIC in that case. What if the ISP1 box
> goes down? All the traffic will go to the ISP2 box from both PEs anyway.
> Isn't it wise not to use internal BGP paths for a link failure? Actually,
> we don't even differentiate a link-down event from a node failure. But we
> are trying to apply different FRR techniques there.
>
> [Satya] Well, we do use internal paths in the best-external case. In case
> of box failure that you mention, if we can infer that, sure, there can be
> an optimization to directly send to ISP2.
> Also, for a possible loop, does not NFRR from the MNA framework solve this
> issue at the transport level?
> [Satya] Will look that up.
> My 2 cents.
>
>
>
>
>
> сб, 12 авг. 2023 г. в 15:45, Robert Raszuk <robert@raszuk.net>:
>
> Satya,
>
>
>
> *Reg PD#1: *
>
>
>
> The problem described as PD#1 arises from a violation of the RFC4456 rules. When your
> RRs are part of the same cluster (and here they clearly are) it is
> mandatory to use the same CLUSTER_ID on both route reflectors. That will
> prevent any reflected routes from being accepted by the other RR.
>
>
>
>    Both these RRs are also clients of each other and advertise VPN routes to each other with the
>
>    next-hop set to the peering address.
>
>
>
> Please do not invent a bandage to heal wounds which should not be self-made
> in the first place. PD#1 as described is a misconfiguration.
>
>
>
> *Reg PD#2:*
>
>
>
> You say:
>
>
>
> >  Failure scenario 2 (FS#2) The links from ISP1 to PE1 and PE2 are down
>
> >  at the same time;
>
>
>
> If those two links go down at the same time, both PEs should notice it
> (optics or BFD) and apply PIC accordingly. PIC on PE1 should result in
> shifting traffic to ISP2. So should the PIC action on PE2.
>
> [Satya] *PE1 cannot know that the PE2-ISP1 link is also down, right*? If
> PE2-ISP1 is not down, then for the traffic to reach ISP1, the correct
> forwarding path is from PE1 to PE2 and then to ISP1. It should not send to
> PE2 as I mentioned in the constraint earlier.
>
>
>
> As with PIC the FIB rewrite is prefix independent so no loop should form.
>
>
>
> As you said, both ISPs advertise an identical set of routes: "Both ISPs
> advertise the same 700k prefixes."
>
>
>
> Only in a situation where you would apply eiBGP multipath could there be
> some micro-loop.
>
>
>
> PIC should be smart and ignore IBGP paths (if their local pref is
> preferred in steady state) when local EBGP paths exist, to heal the data plane
> during the fast repair. Then BGP will converge to the policy-aligned
> selection of exit.
>
> [Satya] As I mentioned this is PIC with an additional constraint.
>
>
>
> Kind regards,
>
> Robert
>
>
>
>
>
> On Thu, Jul 27, 2023 at 9:36 AM Satya Mohanty (satyamoh) <satyamoh=
> 40cisco.com@dmarc.ietf.org> wrote:
>
> Hi Keyur and the chairs,
>
>
>
> Towards the end of my IETF presentation, the audio was coming through garbled at
> my end and was not at all coherent.
>
> I went over the recording today. I am replying to the two
> questions/observations.
>
>
>
> 1)  A suggestion was given to use another label mode, i.e., per-prefix
> (per-vrf does not apply here).  However, using per-prefix label allocation
> would result in the inline RRs/ASBRs exhausting their label space very
> quickly as the route scale increases (there is a platform-dependent upper
> limit). Therefore, using per-prefix label allocation was ruled out in this
> deployment after being given due consideration.
>
>
>
> Cisco IOS-XR has supported the per-nexthop-recvd-label mode for some time now
> in Option-B ASBR and RR with nh-self use-cases, precisely for this reason.
> I believe other vendors have an equivalent mode. The idea is to take advantage
> of the optimal label allocation of this mode and simultaneously ensure fast
> convergence via BGP PIC.
>
>
>
> 2) Regarding the suggestion of not using the proposed attribute, the
> original thought was to use the tunnel-encaps attribute. The problem that I saw
> is that tunnel-encaps can have many sub-TLVs for different purposes,
> and if we wanted to restrict the advertisement of the secondary label to
> routers that do not need it, it would not be that easy, as those same routers
> may need some other sub-TLVs present in that same tunnel-encaps attribute. But
> we do look forward to getting your inputs/suggestions on this, as you
> indicated.
>
>
>
> Thanks.
>
>
>
> Best Regards,
>
> --Satya
>
>
>
>
>
>
>
> *From: *Idr <idr-bounces@ietf.org> on behalf of Satya Mohanty (satyamoh)
> <satyamoh=40cisco.com@dmarc.ietf.org>
> *Date: *Tuesday, July 11, 2023 at 9:44 PM
> *To: *Dongjie (Jimmy) <jie.dong=40huawei.com@dmarc.ietf.org>, idr@ietf.org
> <idr@ietf.org>, MEANS, ISRAEL L <im8327@att.com>, RAMADENU, PRAVEEN <
> pr9637@att.com>
> *Cc: *idr-chairs@ietf.org <idr-chairs@ietf.org>
> *Subject: *Re: [Idr] Call for IETF 117 IDR agenda items
>
> Hi Jie,
>
>
>
> We would like to request a slot of 10 minutes to present the following
> draft. Tuesday slot is preferable.
>
> https://datatracker.ietf.org/doc/draft-mohanty-idr-secondary-label/
>
>
>
> Thanks,
>
> --Satya
>
>
>
> *From: *Idr <idr-bounces@ietf.org> on behalf of Dongjie (Jimmy) <jie.dong=
> 40huawei.com@dmarc.ietf.org>
> *Date: *Tuesday, June 27, 2023 at 3:57 PM
> *To: *idr@ietf.org <idr@ietf.org>
> *Cc: *idr-chairs@ietf.org <idr-chairs@ietf.org>
> *Subject: *[Idr] Call for IETF 117 IDR agenda items
>
> Dear all,
>
>
>
> The draft agenda of IETF 117 is available at
> https://datatracker.ietf.org/meeting/117/agenda. The IDR sessions are
> scheduled as below:
>
>
>
> - Monday Session II  13:00 - 15:00 (local time)  Plaza B
>
>
>
> - Thursday Session IV 17:00 – 18:00 (local time)  Continental 4
>
>
>
> Please start to send any IDR agenda item request to me and CC the chairs (
> idr-chairs@ietf.org). Please include the name of the person who will be
> presenting, and the estimated time you'll need (including Q&A).
>
>
>
> If you plan to make a presentation, please keep in mind the IDR tradition,
> "no Internet Draft - no time slot". You should also plan to send your
> slides to me and CC the chairs no later than 24 hours prior to the IDR
> session, though earlier is better. Please number your slides for the
> benefit of remote attendees. By default your slides will be converted to
> PDF and presented from the PDF.
>
>
>
> Potential presenters may want to take a look at the checklist for
> presenting at IDR:
>
>
>
>
> https://trac.tools.ietf.org/wg/idr/trac/wiki/Checklist%20for%20presenting%20at%20an%20IDR%20meeting
>
>
>
> Best regards,
>
> Jie
>
> _______________________________________________
> Idr mailing list
> Idr@ietf.org
> https://www.ietf.org/mailman/listinfo/idr
>