Re: [Idr] [Responses for the comments during the IETF108] New Version Notification for draft-wang-idr-rd-orf-01.txt

Aijun Wang <> Wed, 12 August 2020 01:42 UTC

From: "Aijun Wang" <>
To: "'Gyan Mishra'" <>
Cc: "'John E Drake'" <>, "'Keyur Patel'" <>, "'Robert Raszuk'" <>, "'UTTARO, JAMES'" <>, "'idr'" <>, <>
Date: Wed, 12 Aug 2020 09:41:57 +0800

Hi, Gyan:

Theoretically, you are right.

But don’t you worry that if just one of the edge points is compromised, misconfigured, or attacked, all of the PEs within your domain are at risk?

Furthermore, if you control different domains and use Option-B/C to provide inter-AS VPN services, couldn’t such “trashing routes” easily spread among these domains?

Don’t you think such a network is too fragile?


RD-ORF is a mechanism that aims to strengthen the network’s resistance to such risks.


Best Regards


Aijun Wang

China Telecom


From: Gyan Mishra [] 
Sent: Wednesday, August 12, 2020 3:54 AM
To: Aijun Wang <>
Cc: John E Drake; Keyur Patel; Robert Raszuk; UTTARO, JAMES; idr
Subject: Re: [Idr] [Responses for the comments during the IETF108] New Version Notification for draft-wang-idr-rd-orf-01.txt




If you use maximum prefix on every PE-CE peering at the edge, how would the PE-RR peering send “trashing routes”? The “trashing routes” would not be present in the VPN RIB on the PE to send to the RR.


To me that’s “Problem Solved”.


Kind Regards 




On Mon, Aug 10, 2020 at 5:20 AM Aijun Wang < <> > wrote:

Hi, Robert:


RD-ORF is mainly for restricting a PE from sending trashing routes. When the source PE receives such a notification, it should find which CE is the real source of the routes. This is not in the scope of this draft, but if necessary, we can add some deployment considerations for this part later.

The “Prefix Limit” mechanism can only protect the edge between PE and CE; it can’t be deployed between PEs that peer via an RR. We should not rely solely on edge protection.


You can also refer to our discussion at



Best Regards


Aijun Wang

China Telecom


From: Robert Raszuk [ <> ] 
Sent: Monday, August 10, 2020 5:06 PM
To: Aijun Wang < <> >
Cc: Gyan Mishra < <> >; UTTARO, JAMES < <> >; Keyur Patel < <> >; John E Drake < <> >; <> ; idr < <> >
Subject: Re: [Idr] [Responses for the comments during the IETF108] New Version Notification for draft-wang-idr-rd-orf-01.txt




> Using the RD-ORF can restrict the influence within one limited scope(within one VRF, one PE, one domain or different domains)  


Please kindly observe that this still does not provide enough granularity.


As you know, a source VRF can have many CEs, and perhaps just one of them, say one out of 10, is injecting trashing routes. Then IMO penalizing reachability to all 10 CEs instead of 1 is a very bad thing to do to your customer's VPN.


RD is per VRF and not per CE and does not provide sufficient granularity to address the protection problem you are pointing out. 
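
The granularity point can be made concrete with a toy model. This is only a sketch: the `VpnRoute` tuple, the `filter_by_rd` helper, and all RD and prefix values are made up for illustration, and real VPN routes do not carry a "source CE" field.

```python
from collections import namedtuple

# Hypothetical model: a VPN route carries the RD of its source VRF and a prefix.
# source_ce is tracked here only so we can see which customers lose reachability.
VpnRoute = namedtuple("VpnRoute", ["rd", "prefix", "source_ce"])

def filter_by_rd(routes, blocked_rd):
    """Drop every route whose RD matches, as an RD-keyed filter would."""
    return [r for r in routes if r.rd != blocked_rd]

# One source VRF (RD 65000:1) with 10 CEs, each announcing one legitimate prefix.
routes = [VpnRoute("65000:1", f"10.{ce}.0.0/16", f"CE{ce}") for ce in range(1, 11)]
# CE10 additionally injects 100 unwanted host routes.
routes += [VpnRoute("65000:1", f"192.0.2.{i}/32", "CE10") for i in range(100)]
# An unrelated VRF on the same PE.
routes.append(VpnRoute("65000:2", "172.16.0.0/16", "CE-other"))

remaining = filter_by_rd(routes, "65000:1")
lost_ces = {r.source_ce for r in routes} - {r.source_ce for r in remaining}
print(len(remaining), sorted(lost_ces))
```

Because the filter key is the RD, all ten CEs behind that VRF disappear together, even though only one of them misbehaved.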


Also, apologies if I missed it, but I do not think you have sufficiently explained what is wrong with using the prefix limit on incoming sessions to your PE. You said this is one way. If it solves the problem, why invent a different way? If it does not solve the problem, please explain why.




On Mon, Aug 10, 2020 at 3:58 AM Aijun Wang < <> > wrote:

Hi, Gyan, Jim, John and Robert:


Thanks for your detailed analysis.

Actually, we know the value of the RTC mechanism, but it is not enough for controlling the route limit within one VRF.

The routes that affect router performance can come from different VRFs and can also come from the same VRF.

RTC aims to solve the former problem; RD-ORF aims to solve the latter.


If we depend on local discard, as Robert proposed, route inconsistency can also occur. It is a sacrifice we should accept.

Using RD-ORF can restrict the influence to one limited scope (within one VRF, one PE, one domain, or different domains) in a controlled manner.

RD-ORF is based on the ORF mechanism, which is carried in point-to-point messages. Every node on the path can apply its own policy, based on its performance criteria, to decide whether or not to send this filter information to the upstream BGP peer.
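
A rough sketch of that hop-by-hop behaviour, under stated assumptions: the function name, the node names, and the boolean "forwards upstream" policy flag are all hypothetical, and nothing here reflects the actual message encoding in the draft.

```python
def nodes_applying_filter(upstream_path):
    """Return the nodes that end up applying a filter sent hop by hop.

    upstream_path is a list of (node_name, forwards_upstream) pairs, ordered
    from the first peer that receives the filter toward the route source.
    Each node applies the filter locally; its own policy flag (an assumption
    of this sketch) decides whether the filter travels one hop further.
    """
    applied = []
    for name, forwards_upstream in upstream_path:
        applied.append(name)
        if not forwards_upstream:
            break  # this node absorbs the filter; peers further upstream never see it
    return applied

# The RR decides it can cope and keeps the filter to itself.
print(nodes_applying_filter([("RR", False), ("PE1", True)]))
# The RR is also under pressure and passes the filter on to the source PE.
print(nodes_applying_filter([("RR", True), ("PE1", True)]))
```

The point of the sketch is only that the scope of the filter is decided one peering at a time rather than flooded.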


Best Regards


Aijun Wang

China Telecom


From: <>  [ <> ] On Behalf Of Gyan Mishra
Sent: Saturday, August 8, 2020 6:39 AM
To: UTTARO, JAMES < <> >
Cc: Keyur Patel < <> >; John E Drake < <> >; <> ; Robert Raszuk < <> >; idr < <> >
Subject: Re: [Idr] [Responses for the comments during the IETF108] New Version Notification for draft-wang-idr-rd-orf-01.txt


Agreed and good points.


As you said the processing is not impacting the forwarding aspect.  Agreed.


Since RTC (SAFI 132) is a flood control mechanism, an RR-to-PE optimization, if you have a ton of SAFIs stacked as you’ve mentioned, the optimization applies to all the SAFIs stacked onto the PE-to-RR peering.


RTC is an optimization in that respect and not a requirement, as the default RT filtering rule eliminates everything except what is explicitly imported.


I gave the IP VPN example, but as you stated, the RTC optimization can be used for a variety of SAFIs but is not a requirement.


L3 VPN is a good example, although RTC does not really come into play as much intra-domain, since in general all PEs have the same VPNs defined. Inter-AS is a different story: there that is not the case, so the flooding can really be cut down, as stated, to interesting RTs only.






On Fri, Aug 7, 2020 at 6:21 PM UTTARO, JAMES < <> > wrote:

The rationale for RTC is not to prevent paths being sent from an RR to a PE in a L3VPN. Granted that this could be a secondary benefit when routes are flooded from an RR towards a PE. Flooding of routes occurs in two cases:


a.  PE is rebooting. In this case the PE is not operational so in reality it does not matter how many paths are being sent. I have seen millions of paths  being sent to a vintage router @1999 and it rips through these paths keeping the ones it needs for the VPNs instantiated on it and discarding the rest.  Again the PE is not up so it really does not matter.

b.  A VRF is added to an existing PE that is operational. In this case the RR floods the paths towards the PE, the RE processes them and keeps the paths of interest. This processing in no way affects the forwarding aspect of the router. I have seen millions of paths flooded towards a PE in this case and there is no effect on forwarding.


RTC’s assumption is that the RR topology services a set of VPNs that are localized. As an example, if VPN A is localized to California and VPN B to New York, an RR topology that has two clusters, East and West, can deploy RTC to ensure the East cluster never has to “see” the VPN A routes and the West cluster never has to “see” the VPN B routes.


Generally speaking, VPNs are not localized in this fashion, so RTC has limited value and the downside of path hiding. Assume one wants to dual-purpose the RR as a stitcher and wants to “stitch” a subset of VPN A paths to VPN B and vice versa. Path hiding would prevent this.


Where RTC makes a lot of sense to me is where the VPN is local. Specifically, the Kompella VPWS spec is by definition local, where each VPN has two members in a point-to-point topology and is also generally localized to a given region/metro/LATA. Here RTC can substantially reduce the number of paths on RRs. In the above example all California VPWS paths are local and there is no need to populate said paths on the East coast RRs. This also makes sense for Kompella/VPLS, EVPN/VPLS, EVPN/VPWS, and EVPN/FXC.



              Jim Uttaro


From: Idr < <> > On Behalf Of Gyan Mishra
Sent: Friday, August 07, 2020 5:32 PM
To: John E Drake < <> >
Cc: Keyur Patel < <> >; <> ; Robert Raszuk < <> >; idr < <> >
Subject: Re: [Idr] [Responses for the comments during the IETF108] New Version Notification for draft-wang-idr-rd-orf-01.txt


Hi Authors 


I agree with Robert’s analysis as well.


As pointed out by Robert, this feature appears to be a dangerous addition to the BGP specification: it introduces the idea of dropping routes for select devices based on criteria that are not holistic.


BGP is a holistic routing protocol, and by that I mean consistent and ubiquitous propagation of NLRIs throughout the domain. The concept of resource-constrained nodes, and creating a BGP knob for them, is out of the scope of the BGP protocol. The goal and primary purpose of BGP is to provide “reachability”, and by that I mean every router in the domain has a consistent view of the topology soft state and an identical RIB. This is no different than the goal of an IGP, which is to provide next-hop reachability and a consistent view of the domain, meaning IGP database synchronization.

Vendors provide plenty of knobs to filter that consistent view of the domain, but those are one-off exception use cases in both the IGP and EGP worlds. In the IGP world, take OSPF or IS-IS for example: as a corner case you can filter the IGP RIB on a router with an inbound prefix filter. This is an exception to the rule and must be done carefully: since the LSDB is “synchronized”, every router interface on every device in the domain must now have that same filter implemented, because the goal of both IGP and EGP is to provide a consistent, identical view of the domain.

Similarly with an EGP such as BGP, iBGP intra-domain filtering is not recommended and BGP is not designed for it, as the goal of BGP is to provide the same reachability everywhere. You can match and manipulate BGP attributes but not filter, as that would break the goal and primary purpose of BGP in the intra-domain construct: a consistent, identical view throughout the domain.


There are already control mechanisms built into BGP to prevent routing loops and re-advertisements, such as the BGP originator-id, cluster-id, and cluster-list path attributes created for route-reflector-client originated routes, as well as RR-side non-client to non-client re-advertisement controls.



From an IP VPN framework perspective, you have to understand that a PE has RT filtering enabled by default, which is the control mechanism. There is absolutely no need for any additional resource-constrained-node knob to drop NLRI in what seems to be a random, inconsistent fashion. What PE “RT filtering” means is a very basic concept, but it applies directly to what you are proposing and is the grounds to stop development of the draft. Say an RR advertises prefixes to a PE, there are a million RTs, and the PE is configured to import only one RT, RT=1. What happens is that the RR advertises 1M RTs and the PE, per “RT filtering”, drops 999,999 RTs, as the PE only has RT=1 defined.


This causes an unnecessary flood of RTs that soaks up the operator’s bandwidth for routes that are going to be dropped anyway.
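
As a small illustration of default RT filtering on the PE, here is a sketch with the numbers scaled down from 1M to 1,000 RTs. The route dictionaries and the `rt_filter` helper are made up for this example and do not correspond to any vendor implementation.

```python
def rt_filter(received_routes, import_rts):
    """Keep a route only if it carries at least one RT the PE imports.

    Returns the kept routes and the number of routes dropped.
    """
    kept = [r for r in received_routes if import_rts & r["rts"]]
    return kept, len(received_routes) - len(kept)

# The RR advertises one route per RT for 1,000 distinct RTs.
advertised = [
    {"prefix": f"10.{i // 256}.{i % 256}.0/24", "rts": {f"RT={i}"}}
    for i in range(1, 1001)
]

# The PE only imports RT=1, so everything else is received and then discarded.
kept, dropped = rt_filter(advertised, import_rts={"RT=1"})
print(len(kept), dropped)
```

Every one of the dropped routes still crossed the RR-to-PE session before being discarded, which is the bandwidth cost described above.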


RTC to the rescue!! 



RTC provides an elegant function to control advertisements from RR to PE and is very robust and sensible in that it is not dropping routes randomly or because a few nodes have a resource constraint. RTC has a well-defined purpose, which is to cut down on a flood of RTs to a PE as described.


So the PE-RR peering has the RTC BGP MP-REACH AFI/SAFI defined on each peer: SAFI 132 is enabled on each PE-RR peering and the capability is established in send/receive state. The PE then sends an ORF to the RR, in an RTC membership report, stating that it only has RT=1 defined for import. The RR, with RTC capability enabled, now honors what the PE has stated and only sends RT=1 and not 1M RTs.
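
The same exchange can be sketched from the RR side. This is a simplified model, not the protocol: the `RouteReflector` class and its method names are hypothetical, and treating a peer with no recorded membership as "send everything" is an assumption standing in for a peer without RTC enabled.

```python
class RouteReflector:
    """Toy RR that filters outbound VPN routes by each peer's RT membership."""

    def __init__(self, routes):
        self.routes = routes      # list of (prefix, frozenset_of_rts)
        self.membership = {}      # peer name -> set of RTs the peer asked for

    def receive_rt_membership(self, peer, rts):
        """Record the RTs a PE says it imports."""
        self.membership[peer] = set(rts)

    def routes_for(self, peer):
        """Routes the RR would advertise to this peer."""
        wanted = self.membership.get(peer)
        if wanted is None:
            # Assumption: no membership recorded means the peer does not use RTC.
            return list(self.routes)
        return [(prefix, rts) for prefix, rts in self.routes if rts & wanted]

rr = RouteReflector(
    [(f"10.{i // 256}.{i % 256}.0/24", frozenset({f"RT={i}"})) for i in range(1, 1001)]
)
rr.receive_rt_membership("PE1", {"RT=1"})
print(len(rr.routes_for("PE1")), len(rr.routes_for("PE2")))
```

Here the filtering happens before transmission, so PE1 receives one route instead of receiving 1,000 and discarding 999.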


Hooray for RTC saves the day = Job well done


This issue can be exacerbated with inter-AS NNI when using Option-B, where all VPN routes are propagated. In this case we have default “RT filtering” to the rescue, so that the RT propagation is done in a controlled fashion. By default the RR node does not have RT filtering enabled; only PEs have that feature enabled by default. With Option-B the inter-AS link is a PE-PE peering connecting the two SP domains together. Looking deeper into Option-B, you can set “retain-route-target”, which changes the default behavior and is required for the VPN routes to be propagated. In some cases the RT does not match between providers, so an RT rewrite is required, and in that case the rewrite policy becomes the RTC-like filtering mechanism. In the case where the RT is the same between providers, we tie “retain-route-target all” to an “RTC-like” RT filter to only accept RTs pertinent to the domain.


With inter-AS Option C we run BGP LU (labeled unicast) for the end loopback FECs, all leaked between domains via eBGP and iBGP LU for data plane forwarding, and the control plane VPN peering that was PE-PE in Option-B is now RR-RR eBGP VPNv4/VPNv6 (SAFI 128). Since the RR sits in the control plane, not the data plane, “RT filtering” is not enabled by default. For proper end-to-end LSP data plane forwarding we have to set BGP “next-hop-unchanged” so that the next-hop attribute is the ingress and egress PEs between domains for the end-to-end LSP. In this case all the SAFI 128 RTs are propagated between domains. You can apply explicit RT filtering in this scenario, but generally that is not done, as Option C is used mostly within the same administrative domain, under a special agreement between SPs that work closely together, or in M&A-type scenarios. We still have RTC (SAFI 132) enabled PE-RR within each domain, so even if you got a quadrillion RTs on the RR-RR peering and did not want to filter there, we have RTC back to the rescue: if you only have 1M RTs defined for import on the PEs within each domain, then in a graceful, controlled fashion only 1M RTs are now advertised RR to PE within each domain.


This same concept applies to SR as well, both SR-MPLS and SRv6, since IP VPN is an “overlay”.


From a BGP architecture and designer perspective, this is how you build an “SP core” in the standard framework: the control plane and data plane are separated, the RRs are a dedicated control plane function and don’t sit in the data plane, and the PEs (RR clients) sit at the edge in the data plane for the ingress-to-egress LSP to the FEC to be built.


So in a nutshell, with a combination of default RT filtering on PEs and controlled RT advertisements via ORF RT membership via RTC, we have complete control of SAFI 128 advertisements for resource-constrained nodes as it stands today, and there is absolutely no need for this draft to be considered by IDR.


I did not mention the per-VRF maximum prefix knobs to control how big each VPN gets at a micro level, so we have that control mechanism as well. In general, most providers set a per-VRF maximum as a rule of thumb, sized according to the most resource-constrained node within the operator’s network, and publish it in the customer SLA as part of the VPN service offering.


Kind Regards 




On Fri, Aug 7, 2020 at 8:35 AM John E Drake < <> > wrote:



I agree with Robert’s analysis.  There is no need to define an external mechanism to deal with an internal node issue. 


Yours Irrespectively,






From: Idr < <> > On Behalf Of Robert Raszuk
Sent: Friday, August 7, 2020 3:45 AM
To: Aijun Wang < <> >
Cc: Keyur Patel < <> >; <> ; idr < <> >
Subject: Re: [Idr] [Responses for the comments during the IETF108] New Version Notification for draft-wang-idr-rd-orf-01.txt





 > Using RD-ORF can control the VPN routes limits within each VRF.


If you really want to create such inconsistency within any VPN, you can ask your vendor for a knob to locally drop, on inbound, received routes with a specific RD. That can also be fully automated by setting an auto-trigger threshold per VRF.


The inbound drop is a very low-cost operation, so it will not be a CPU issue.


You will then be able to use such a *local* knob as an additional safety fuse.
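
A minimal sketch of what such a local fuse might look like. Everything here is hypothetical: the `LocalRdFuse` class is not a real vendor feature, and counting per RD rather than per VRF is a simplification made for the example.

```python
class LocalRdFuse:
    """Toy local safety fuse: once the routes accepted for an RD reach the
    threshold, further inbound routes with that RD are dropped locally.
    Nothing is signaled to any BGP peer."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.count_by_rd = {}
        self.tripped = set()

    def accept(self, rd, prefix):
        """Return True if the route is accepted, False if it is dropped."""
        if rd in self.tripped:
            return False
        count = self.count_by_rd.get(rd, 0) + 1
        if count > self.threshold:
            self.tripped.add(rd)   # auto-trigger: drop this RD from now on
            return False
        self.count_by_rd[rd] = count
        return True

fuse = LocalRdFuse(threshold=3)
results = [fuse.accept("65000:1", f"10.0.{i}.0/24") for i in range(5)]
print(results, fuse.accept("65000:2", "10.9.0.0/24"))
```

The drop decision is a set lookup per received route, which is consistent with the claim that it costs very little, and other RDs are unaffected.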


There is, however, no need to propagate this anywhere, nor to pass it from the RR to upstream PEs. Such a knob can exist on RRs as well, if you want to break your users' happiness even further.


I see no need, and I stand by my original position that this proposed protocol extension is a bad one. I do not support it proceeding any further, nor being adopted as an IDR WG document.


Kind regards,




On Fri, Aug 7, 2020 at 3:20 AM Aijun Wang < <> > wrote:

Hi, Robert and Gyan:


The problem is similar to that of “BGP Maximum-Prefix” and is a control-plane-related issue.

No matter what the capabilities of the router are, there is always a ceiling on its performance.

And in large network deployments, there is still the possibility that one or more PEs are misconfigured, attacked, etc.

We should not only control PE behavior at the network edge (for example, using the BGP Maximum-Prefix feature), but also consider the risk within the network.


RTC is one method of controlling route propagation within the network, but it is not enough.

RTC can filter out the routes of VRFs that a PE doesn’t want, but it can’t suppress routes in the VPNs it does want. Using RD-ORF can control the VPN route limits within each VRF.
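
A toy illustration of this argument, with made-up RDs, RTs, and prefixes; the two helper functions are hypothetical and say nothing about how either mechanism is encoded on the wire.

```python
def rt_filter(routes, wanted_rts):
    """RT-based selection: keep only routes for VPNs the PE imports."""
    return [r for r in routes if r["rt"] in wanted_rts]

def rd_filter(routes, blocked_rds):
    """RD-based suppression: drop routes from specific source VRFs."""
    return [r for r in routes if r["rd"] not in blocked_rds]

routes = (
    # Wanted VPN (RT=1), well-behaved source VRF.
    [{"rd": "65000:1", "rt": "RT=1", "prefix": f"10.0.{i}.0/24"} for i in range(5)]
    # Same wanted VPN, but a source VRF that is overflowing.
    + [{"rd": "65000:2", "rt": "RT=1", "prefix": f"10.1.{i}.0/24"} for i in range(200)]
    # A VPN this PE does not import at all.
    + [{"rd": "65000:3", "rt": "RT=2", "prefix": f"10.2.{i}.0/24"} for i in range(10)]
)

after_rt = rt_filter(routes, {"RT=1"})
after_rd = rd_filter(after_rt, {"65000:2"})
print(len(routes), len(after_rt), len(after_rd))
```

Selecting by RT removes the unwanted VPN but still passes all 200 overflow routes, because they carry a wanted RT; only the RD-keyed step removes them, at the cost of losing everything from that source VRF.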



Best Regards


Aijun Wang

China Telecom


From: Robert Raszuk [ <> ] 
Sent: Friday, August 7, 2020 7:05 AM
To: Aijun Wang < <> >; <> 
Cc: Keyur Patel < <> >; idr < <> >; Gyan Mishra < <> >
Subject: Re: [Idr] [Responses for the comments during the IETF108] New Version Notification for draft-wang-idr-rd-orf-01.txt




I think we need to step back and first understand what the problem is.


Statements like "overload of the PE", "overwhelmed PE", etc. are really not helpful. Neither are they sufficient to justify this work based on the statement "we need more than one solution", etc.


What exactly is being considered a problem here?


1. Running out of RAM in the control plane?

2. Running out of CPU in the control plane?

3. Not being able to import VPNv4 routes into a VRF due to insufficient control plane memory?

4. Not being able to import VPNv4 routes into a VRF due to insufficient data plane memory?


And please do not say "all of the above", as that will also not be very productive.


While stating the above, please indicate which vendors have been tested and do not offer feature-wise sufficient protection.


Many thx,





On Fri, Aug 7, 2020 at 12:50 AM Gyan Mishra < <> > wrote:



On Thu, Aug 6, 2020 at 7:29 AM Aijun Wang < <> > wrote:

Hi, Gyan and Robert:


Maximum Prefix Limit is one method to control the routes between PE and CE, but we should not depend only on it. 


   You can use maximum prefix along with an inbound filter (prefix list, AS path, or community match) for whatever style of routing control is desired.


From a PE standpoint, you have not explained what is wrong with using the per-VRF prefix limit to limit the size of the per-VRF RIB. Please explain why that does not solve the problem.


RD-ORF can limit the scope of influence of a misbehaving PE to be as small as possible.


    I don’t think you want to use RD-ORF, as that would drop all routes from a PE, and RTC does the job well by only advertising RTs imported on the PE. I think it would be rare, if ever, that anyone would filter on RD.


RT-ORF can suppress the routes from unwanted VRFs, but can’t suppress the overflow routes in a VRF that the PE is interested in.




More detailed responses are inline below.

Aijun Wang

China Telecom


On Aug 6, 2020, at 18:02, Gyan Mishra < <> > wrote:


Hi Robert 


I am in agreement as you stated that most service providers from my experience use the per VRF prefix limit to protect resources.  Problem solved as you said 20+ years ago.  


That is a general rule of thumb for any service providers to perform due diligence on their PE memory resource carving per VRF and set it appropriately based on platform and total number of VRFs to account for.  


Problem solved on the SP end.


On the customer end, they can also use the maximum prefix peer command as well to prevent flood of routes in case of unwanted advertisements from unintentional VRF leaking by providers.


Kind Regards 





On Thu, Aug 6, 2020 at 5:49 AM Robert Raszuk < <> > wrote:

Hi Gyan,


Thank you for your comments - all very valid observations. 


Just to perhaps clarify one thing: the problem the authors are attempting to address, the way I understand it, is that a given resource may be suffering from actually legitimate VPN routes, hence to use RTC a lot of additional RTs would indeed have to be applied.


But I do not understand why the authors fail to recognize that the solution to their problem was invented and implemented over 20 years ago. The solution is to control, on a per-*ingress*-VRF basis, the number of VPN routes a customer is authorized to inject into their VPN, with an eBGP PE-CE prefix limit.
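
A minimal sketch of a per-session prefix limit at the PE-CE edge. The `PeCeSession` class is hypothetical, and tearing the session down when the limit is exceeded is an assumed action for this example; real implementations offer different actions, such as warning only.

```python
class PeCeSession:
    """Toy eBGP PE-CE session with a maximum-prefix limit."""

    def __init__(self, max_prefixes):
        self.max_prefixes = max_prefixes
        self.prefixes = set()
        self.up = True

    def receive(self, prefix):
        """Process one announcement from the CE; return True if it is installed."""
        if not self.up:
            return False
        is_new = prefix not in self.prefixes
        if is_new and len(self.prefixes) >= self.max_prefixes:
            # Assumed action: tear the session down and flush what the CE sent,
            # so nothing from this CE is left to advertise toward the RR.
            self.up = False
            self.prefixes.clear()
            return False
        self.prefixes.add(prefix)
        return True

session = PeCeSession(max_prefixes=2)
outcomes = [session.receive(p) for p in ("10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24")]
print(outcomes, session.up, len(session.prefixes))
```

The limit is enforced where the customer routes enter the provider network, so the excess never reaches the VPN RIB of the ingress PE.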


[WAJ] We have mentioned prefix limit solutions in the draft and analyzed their scope of applicability.


Most SPs offering L3VPNs use prefix limit successfully to protect their shared resources for the vast majority of customers and deployments. For VPN customers with an unpredictable amount of routing, the CSC model should be used instead.


By all means, filtering and dropping VPN routes that have been accepted into the SP network should not take place.








On Thu, Aug 6, 2020 at 11:41 AM Gyan Mishra < <> > wrote:

Hi Aijun


I agree with Robert that you cannot filter by RD or you would drop all the routes; filtering must be done by RT. Also, the issue with an RT ORF filter is, as Robert mentioned, that you may have the same prefix with two different RTs, which is common, made unique by RD, and so the ORF would drop the prefixes.


[WAJ] Such a situation can only happen in extranet VPN scenarios, which should be designed carefully: one must keep the prefixes in these VRFs from overlapping. But if the prefixes in these VRFs do not overlap, why do we need to separate them into different VRFs?

In conclusion, this is just a corner case and should be avoided in design.




I am not sure I understand what problem you are trying to solve that is not already solved by RTC membership so that only RTs imported by the PE are what is advertised by the RR.  That i



Gyan Mishra

Network Solutions Architect 

M 301 502-1347
13101 Columbia Pike 
Silver Spring, MD