Re: [Idr] [Responses for the comments during the IETF108] New Version Notification for draft-wang-idr-rd-orf-01.txt

"UTTARO, JAMES" <> Fri, 07 August 2020 22:21 UTC

From: "UTTARO, JAMES" <>
To: Gyan Mishra <>, John E Drake <>
CC: Keyur Patel <>, "" <>, Robert Raszuk <>, idr <>
Thread-Topic: [Idr] [Responses for the comments during the IETF108] New Version Notification for draft-wang-idr-rd-orf-01.txt
Date: Fri, 7 Aug 2020 22:21:03 +0000

The rationale for RTC is not to prevent paths being sent from an RR to a PE in an L3VPN. Granted, this could be a secondary benefit when routes are flooded from an RR towards a PE. Flooding of routes occurs in two cases:

  1.  The PE is rebooting. In this case the PE is not operational, so in reality it does not matter how many paths are being sent. I have seen millions of paths being sent to a vintage router (circa 1999) and it rips through these paths, keeping the ones it needs for the VPNs instantiated on it and discarding the rest. Again, the PE is not up, so it really does not matter.
  2.  A VRF is added to an existing PE that is operational. In this case the RR floods the paths towards the PE, and the RE processes them and keeps the paths of interest. This processing in no way affects the forwarding aspect of the router. I have seen millions of paths flooded towards a PE in this case and there is no effect on forwarding.

RTC’s assumption is that the RR topology services a set of VPNs that are localized. As an example, if VPN A is localized to California and VPN B to New York, an RR topology that has two clusters, East and West, can deploy RTC to ensure the East cluster never has to “see” the VPN A routes and the West cluster never has to “see” the VPN B routes.
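A rough sketch of that localization idea, for illustration only. The route-target strings, RD values, prefixes and PE names below are all made up, and the model ignores everything about BGP except "which RTs do the PEs behind this cluster import".

```python
from collections import namedtuple

# Hypothetical VPN route: route distinguisher, prefix, attached route targets.
Route = namedtuple("Route", ["rd", "prefix", "rts"])

def routes_needed_by_cluster(all_routes, client_import_rts):
    """Return the routes a cluster must hold, given the RTs its PEs import."""
    wanted = set().union(*client_import_rts.values())
    return [r for r in all_routes if wanted & set(r.rts)]

all_routes = [
    Route("65000:1", "10.1.0.0/16", ("target:65000:100",)),  # VPN A, California
    Route("65000:2", "10.2.0.0/16", ("target:65000:200",)),  # VPN B, New York
]
west_clients = {"pe-ca-1": {"target:65000:100"}}  # assumed West cluster PE
east_clients = {"pe-ny-1": {"target:65000:200"}}  # assumed East cluster PE

west = routes_needed_by_cluster(all_routes, west_clients)
east = routes_needed_by_cluster(all_routes, east_clients)
print([r.prefix for r in west])
print([r.prefix for r in east])
```

Each cluster ends up holding only the routes of the VPN that is local to it, which is the case where RTC pays off.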

Generally speaking, VPNs are not localized in this fashion, so RTC has limited value and carries the downside of path hiding. Assume one wants to dual-purpose the RR as a stitcher and wants to “stitch” a subset of VPN A paths to VPN B and vice versa. Path hiding would prevent this.

Where RTC makes a lot of sense to me is where the VPN is local. Specifically, the Kompella VPWS spec is by definition local: each VPN has two members in a point-to-point topology and is also generally localized to a given region/metro/LATA.  Here RTC can substantially reduce the number of paths on RRs. In the above example all California VPWS paths are local and there is no need to populate said paths on the East coast RRs. This also makes sense for Kompella/VPLS, EVPN/VPLS, EVPN/VPWS, and EVPN/FXC.

              Jim Uttaro

From: Idr <> On Behalf Of Gyan Mishra
Sent: Friday, August 07, 2020 5:32 PM
To: John E Drake <>
Cc: Keyur Patel <>;; Robert Raszuk <>; idr <>
Subject: Re: [Idr] [Responses for the comments during the IETF108] New Version Notification for draft-wang-idr-rd-orf-01.txt

Hi Authors

I agree with Robert’s analysis as well.

As pointed out by Robert, this feature appears to be a dangerous addition to the BGP specification.  The idea is to drop routes for select devices based on criteria that are not holistic.

BGP is a holistic routing protocol, and by that I mean consistent and ubiquitous propagation of NLRIs throughout the domain.  The concept of resource-constrained nodes, and creating a BGP knob for them, is out of the scope of the BGP protocol.  The goal and primary purpose of BGP is to provide “reachability”, and by that I mean every router in the domain has a consistent view of the topology soft state and an identical RIB.  This is no different than the goal of an IGP, which is to provide next-hop reachability and a consistent view of the domain, meaning IGP database synchronization.  Vendors provide plenty of knobs to filter that consistent view of the domain, but those are one-off exception use cases, in both the IGP and EGP world.  In the IGP world, take OSPF or ISIS for example: as a corner case you can filter the IGP RIB on a router with an inbound prefix filter.  This is an exception to the rule and must be done carefully: since the LSDB is “synchronized”, every router interface on every device in the domain must now have that same filter implemented, as the goal of both IGP and EGP is to provide a consistent, identical view of the domain.  Similarly with an EGP such as BGP, iBGP intra-domain filtering is not recommended and BGP is not designed for it, as the goal of BGP is to provide the same reachability.  You can match and manipulate BGP attributes but not filter, as that would break the goal and primary purpose of BGP in the intra-domain construct: a consistent, identical view throughout the domain.

There are already control mechanisms built into BGP to prevent routing loops and re-advertisements, such as the BGP originator-id, cluster-id and cluster list path created for route-reflector-client originated routes to prevent loops, as well as RR-side non-client to non-client re-advertisement controls.

From an IP VPN framework perspective you have to understand that a PE has RT filtering enabled by default, which is the control mechanism.  There is absolutely no need for any additional resource-constrained node knob to drop NLRI in what seems to be a random, inconsistent fashion.  What PE “RT filtering” means is a very basic concept, but it applies directly to what you are proposing and is the grounds to stop development of the draft.  So when an RR advertises prefixes to a PE, let’s say there are a million RTs and let’s say the PE is configured to import RT=1, so only one RT.  What happens now is that the RR advertises 1M RTs and the PE, per “RT filtering”, drops 999,999 RTs as the PE only has RT=1 defined.

This causes an unnecessary flood of RTs that soaks up the operator’s bandwidth, only for those routes to be dropped anyway.
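To put numbers on the example above, here is a toy model of the default PE-side import filter. It is only a sketch: RTs are modelled as plain integers 1 to 1,000,000 and the route names are invented.

```python
def pe_import_filter(received, import_rts):
    """Keep only routes carrying at least one RT the PE imports."""
    return [r for r in received if import_rts & r["rts"]]

n = 1_000_000
# Everything the RR sends without RTC: one route per RT (hypothetical data).
received = ({"prefix": f"route-{i}", "rts": {i}} for i in range(1, n + 1))

kept = pe_import_filter(received, import_rts={1})
print(len(kept), n - len(kept))  # routes kept, routes received and then dropped
```

All n routes still cross the RR-to-PE session; the filter only decides what the PE keeps afterwards.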

RTC to the rescue!!

RTC provides an elegant function to control advertisements from RR to PE and is very robust and sensible, in that it is not dropping routes randomly or because a few nodes have a resource constraint.  RTC has a well-defined purpose, which is to cut down on a flood of RTs to a PE as described.

So the PE-RR peering has the RTC address family (MP-BGP AFI/SAFI, SAFI 132) enabled on each peer, and the capability is established in send/receive state.  What happens now is that the PE does an ORF to the RR stating that it only has RT=1 defined for import, which is sent in an RTC membership report to the RR.  The RR with RTC capability enabled now honors what the PE has stated and only sends RT=1 and not 1M RTs.
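The same idea seen from the RR side, as a minimal sketch. The function name, the data layout and the scaled-down count of 1,000 RTs are assumptions made for illustration; this models the outcome of RT membership, not the BGP messages themselves.

```python
def rr_advertise(rib, membership=None):
    """Routes the RR sends to one PE.

    membership=None models a peering without RTC: everything is sent.
    Otherwise only routes with an RT in the PE's advertised membership go out.
    """
    if membership is None:
        return list(rib)
    return [r for r in rib if r["rts"] & membership]

# Hypothetical RR table: one route per RT, RTs modelled as integers.
rib = [{"prefix": f"route-{rt}", "rts": {rt}} for rt in range(1, 1001)]
pe_membership = {1}  # the PE imports only RT=1

without_rtc = rr_advertise(rib)
with_rtc = rr_advertise(rib, pe_membership)
print(len(without_rtc), len(with_rtc))
```

With membership known, the filtering moves from the PE's inbound side to the RR's outbound side, so the unwanted routes are never sent.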

Hooray for RTC saves the day = Job well done

This issue can be exacerbated with NNI inter-as when using option-b, where all VPN routes are propagated.  In this case we have default “RT filtering” to the rescue, so that RT propagation is done in a controlled fashion.  By default the RR node does not have RT filtering enabled; only PEs have that feature enabled by default.  With option-b the inter-as link is a PE-PE peering connecting the two SP domains together.  Looking deeper into option-b, you can set “retain-route-target”, which changes the default behavior and is required for the VPN routes to be propagated.  In some cases the RT does not match between providers, so an RT rewrite is required, and in that case the rewrite policy becomes the RTC-like filtering mechanism.  In cases where the RT is the same between providers, we tie the “retain-route-target all” to an RT filter (“RTC like”) to only accept RTs pertinent to the domain.

With inter-as option C we run BGP LU (labeled unicast) for the loopback FECs, all leaked between domains via eBGP and iBGP LU for data plane forwarding, and the control plane VPN peering that was PE-PE in option-b is now RR-RR eBGP vpnv4/vpnv6 SAFI 128.  Since the RR sits in the control plane, not the data plane, “RT filtering” is not enabled by default.  For proper end-to-end LSP data plane forwarding we have to set BGP “next-hop-unchanged” so that the next hop attribute is the ingress and egress PEs between domains for the end-to-end LSP.  So in this case all the SAFI 128 RTs are propagated between domains.  You can apply explicit RT filtering in this scenario, but generally that is not done, as opt-c is used mostly within the same administrative domain, under a special agreement between SPs that work closely together, or in M&A-type scenarios.  We still have RTC SAFI 132 enabled PE-RR within each domain, so even if you got a quadrillion RTs on the RR-RR peering and did not want to filter there, we have RTC back to the rescue: if you only have 1M RTs defined for import on the PE within each domain, then in a graceful, controlled fashion only 1M RTs are now advertised RR to PE within each domain.

As IP VPN is an “overlay”, this same concept applies to SR as well, both SR-MPLS and SRv6.

From a BGP architecture and design perspective, an “SP core” is built in that standard framework with separation of control plane and data plane: the RRs are a dedicated control plane function and don’t sit in the data plane, while the PEs (the RR clients) sit at the edge, in the data plane, for the ingress-to-egress LSP to each FEC to be built.

So in a nutshell, with a combination of default RT filtering on PEs and controlled RT advertisements via ORF RT membership via RTC, we have complete control of SAFI 128 advertisements for resource-constrained nodes as it stands today, and there is absolutely no need for this draft to be considered by IDR.

I did not mention the per-VRF maximum prefix knobs that control how big each VPN gets at a micro level, so we have that control mechanism as well.  In general most providers set a per-VRF maximum as a rule of thumb, sized according to the most resource-constrained node within the operator’s network, and publish it in the customer SLA contract as part of the VPN service offering.

Kind Regards


On Fri, Aug 7, 2020 at 8:35 AM John E Drake <<>> wrote:

I agree with Robert’s analysis.  There is no need to define an external mechanism to deal with an internal node issue.

Yours Irrespectively,


From: Idr <<>> On Behalf Of Robert Raszuk
Sent: Friday, August 7, 2020 3:45 AM
To: Aijun Wang <<>>
Cc: Keyur Patel <<>>;<>; idr <<>>
Subject: Re: [Idr] [Responses for the comments during the IETF108] New Version Notification for draft-wang-idr-rd-orf-01.txt


 > Using RD-ORF can control the VPN routes limits within each VRF.

If you really want to create such inconsistency within any VPN, you can ask your vendor for a knob to locally drop, on inbound, received routes with a specific RD. That can also be fully automated by setting an auto-trigger threshold per VRF.

The inbound drop is a very low cost operation, so it will not be a CPU issue.

You will then be able to use such a *local* knob as an additional safety fuse.

There is however no need to propagate this anywhere nor to pass it from RR to upstream PEs. Such knob can exist on RRs as well if you even further want to completely break your user's happiness.
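A minimal sketch of what such a local fuse could look like. This is purely hypothetical: the class name, the threshold semantics and the "block the RD that crosses the limit" rule are assumptions for illustration, not a description of any vendor's feature.

```python
class LocalRdFuse:
    """Hypothetical per-VRF safety fuse; not a real vendor feature."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.routes = set()       # installed (rd, prefix) pairs
        self.blocked_rds = set()  # purely local state, never signalled to peers

    def receive(self, rd, prefix):
        key = (rd, prefix)
        if rd in self.blocked_rds:
            return "dropped"
        if key not in self.routes and len(self.routes) >= self.threshold:
            self.blocked_rds.add(rd)  # trip the fuse for this RD
            return "dropped"
        self.routes.add(key)
        return "installed"

vrf = LocalRdFuse(threshold=2)
results = [vrf.receive("65000:9", f"10.0.{i}.0/24") for i in range(4)]
print(results)
```

Nothing in the sketch leaves the node: the blocked-RD set is local state, and neither the RR nor upstream PEs learn about it.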

I see no need, and I stand by my original position that this proposed protocol extension is a bad one. I do not support it proceeding any further or being adopted as an IDR WG document.

Kind regards,

On Fri, Aug 7, 2020 at 3:20 AM Aijun Wang <<>> wrote:
Hi, Robert and Gyan:

The problem is similar to that of “BGP Maximum-Prefix” and is a control-plane issue.
No matter what the capabilities of the router are, there is always a ceiling on its performance.
And in large network deployments, there is still the possibility that one or more PEs are misconfigured, attacked, etc.
We should not only control the PE behavior at the network edge (for example, using the BGP Maximum-Prefix feature), but also consider the risk within the network.

RTC is one method to control route propagation within the network, but it is not enough.
RTC can filter out the routes of VRFs a PE doesn’t want, but it can’t suppress routes within a VPN it does want.  Using RD-ORF can control the VPN route limits within each VRF.

Best Regards

Aijun Wang
China Telecom

From: Robert Raszuk [<>]
Sent: Friday, August 7, 2020 7:05 AM
To: Aijun Wang <<>>;<>
Cc: Keyur Patel <<>>; idr <<>>; Gyan Mishra <<>>
Subject: Re: [Idr] [Responses for the comments during the IETF108] New Version Notification for draft-wang-idr-rd-orf-01.txt


I think we need to step back and first understand what the problem is.

Statements "overload of the PE" .. "overwhelmed PE" etc ... are really not helpful. Neither are they sufficient to justify this work based on the statement "we need more than one solution" etc ....

What exactly is being considered a problem here ?

1. Running out of RAM in control plane ?
2. Running out of CPU in control plane ?
3. Not being able to import VPNv4 routes into VRFs due to not enough control plane memory ?
4. Not being able to import VPNv4 routes into VRFs due to not enough data plane memory ?

And please do not say "all of the above" as that will also not be very productive.

While stating the above, please indicate which vendors have been tested and do not provide sufficient protection feature-wise.

Many thx,

On Fri, Aug 7, 2020 at 12:50 AM Gyan Mishra <<>> wrote:

On Thu, Aug 6, 2020 at 7:29 AM Aijun Wang <<>> wrote:
Hi, Gyan and Robert:

Maximum Prefix Limit is one method to control the routes between PE and CE, but we should not depend only on it.

   You can use maximum prefix along with an inbound filter (prefix list, AS path or community match) for whatever style of routing control is desired.

From a PE standpoint, you have not addressed using the per-VRF prefix limit to limit the size of the per-VRF RIB.  Please explain why that does not solve the problem.

RD-ORF can limit the influence scope of a misbehaving PE to be as small as possible.

    I don’t think you want to use RD-ORF, as that would drop all routes from a PE, and RTC does the job well by only advertising RTs imported on the PE.  I think it would be rare, if ever, that anyone would filter on RD.

RT-ORF can suppress the routes from unwanted VRFs, but can’t suppress the overflow routes in a VRF that it is interested in.

More detailed responses are inline below.
Aijun Wang
China Telecom

On Aug 6, 2020, at 18:02, Gyan Mishra <<>> wrote:

Hi Robert

I am in agreement as you stated that most service providers from my experience use the per VRF prefix limit to protect resources.  Problem solved as you said 20+ years ago.

That is a general rule of thumb for any service providers to perform due diligence on their PE memory resource carving per VRF and set it appropriately based on platform and total number of VRFs to account for.

Problem solved on the SP end.

On the customer end, they can also use the maximum prefix peer command to prevent a flood of routes in case of unwanted advertisements from unintentional VRF leaking by providers.

Kind Regards


On Thu, Aug 6, 2020 at 5:49 AM Robert Raszuk <<>> wrote:
Hi Gyan,

Thank you for your comments - all very valid observations.

Just to perhaps clarify one thing ... The problem the authors are attempting to address, the way I understand it, is that a given resource may be suffering from actually legitimate VPN routes, hence to use RTC a lot of additional RTs would indeed have to be applied.

But I do not understand why the authors fail to recognize that the solution for their problem was invented and implemented over 20 years ago already. The solution is to control, on a per *ingress* VRF basis, the number of VPN routes a customer is authorized to inject into his VPN, with an eBGP PE-CE prefix limit.

[WAJ] We have mentioned prefix limit solutions in the draft and analyzed their applicability scope.

Most SPs offering L3VPNs use prefix limit successfully to protect their shared resources for the vast majority of customers and deployments. For VPN customers with an unpredictable amount of routing, the CSC model should be used instead.

By all means, filtering and dropping VPN routes already accepted into the SP network should not take place.
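As a rough illustration of the ingress prefix limit described above. The function and its "tear the session down and keep nothing" reaction are assumptions for the sketch; real implementations offer different reactions (warning only, restart timers, and so on).

```python
def accept_from_ce(announced_prefixes, max_prefixes):
    """Return (accepted, session_up) under a hypothetical teardown-on-exceed policy."""
    accepted = []
    for prefix in announced_prefixes:
        if len(accepted) == max_prefixes:
            # One more prefix than authorized: drop the session, keep nothing.
            return [], False
        accepted.append(prefix)
    return accepted, True

within_limit = accept_from_ce(["10.0.0.0/24", "10.0.1.0/24"], max_prefixes=2)
over_limit = accept_from_ce(["10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24"], max_prefixes=2)
print(within_limit)
print(over_limit)
```

The excess is stopped at the ingress PE-CE session, before the routes become VPN routes inside the SP network.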


On Thu, Aug 6, 2020 at 11:41 AM Gyan Mishra <<>> wrote:
Hi Aijun

I agree with Robert that you cannot filter by RD or you would drop all the routes, and filtering must be done by RT.  Also, the issue with the RT ORF filter is, as Robert mentioned, that you may have the same prefix with two different RTs, which is common, made unique by RD, and so the ORF would drop the prefixes.

[WAJ] Such a situation can only happen in the extraVPN scenarios, which should be designed carefully: one must keep the prefixes in these VRFs from overlapping. But if the prefixes in these VRFs do not overlap, why do we need to separate them into different VRFs?
In conclusion, this is just a corner case and should be avoided in design.

I am not sure I understand what problem you are trying to solve that is not already solved by RTC membership so that only RTs imported by the PE are what is advertised by the RR.  That is most effective way of cutting down the RT flooding that occurs in the RR to PE advertisement.  RT filtering is enabled by Default on all PEs and only if the RT is imported on the PE are the RTs accepted into the vpn rib. That works pretty well in cutting down RT advertisements by the RR.

As Robert mentioned, each VRF has a maximum prefix limit, which is defined on the PE RIBs per VRF, and in general on most current hardware, or even hardware from the last 10 years, a minimum of 1M prefixes per VRF is pretty standard with most vendors and platforms.  The vpn rib limit is much, much higher on the higher-end platforms.

Your draft talks about inter-as issues solved with RT-ORF.  With PE-PE inter-as option B, by default all RTs are dropped due to default RT filtering, and the only RTs that are accepted are those RTs that are explicitly being imported on the PE ASBR.  There is an option for retain route-target all that disables the default RT filtering so that all VPN routes can be accepted on the inter-as option B link.  However, an RT filter can still be applied to the retain-route-target all so that only pertinent RTs are accepted inter domain.  That seems to work pretty well.

As far as inter-as option C, the PE-PE ASBRs do not maintain the VPN RIB.  BGP LU is enabled on the inter-as link for the end to end LSP, importing the loopbacks between ASs for the end to end LSP to be built.   The RRs between the SPs have eBGP VPN IPv4 VPN IPV6 peering with next hop unchanged so the data plane gets built between the PEs.  The RR, unlike the PE, does not have RT filtering enabled by default, so it is able to reflect all the vpn routes learned to all PEs within each AS.  In the inter-as scenario as well, RTC works very well with the RT membership to cut down on RR to PE vpn route advertisements.

Kind Regards


On Wed, Aug 5, 2020 at 12:49 PM Aijun Wang <<>> wrote:
Hi, Robert:

Aijun Wang
China Telecom

On Aug 6, 2020, at 00:14, Robert Raszuk <<>> wrote:

[WAJ] The VPN routes imported in these VRFs can’t use the same RD, or else the VPN prefixes in different VRFs will collide on the RR.

Nothing will "collide" on RRs.

NLRI = RD+Prefix  not just the RD.

[WAJ] The prefix part can overlap in different VRFs. If the RD is the same, RD+Prefix will also overlap.
We must make sure different VRFs use different RDs to make the VPN prefixes unique within the domain.
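The RD+Prefix point in the exchange above can be made concrete with a small table keyed the same way. The RD values, prefixes and VRF names are invented for illustration.

```python
# Hypothetical RR-side table keyed by the VPN NLRI, i.e. (RD, prefix).
vpn_table = {}

def add_route(rd, prefix, vrf_name):
    """Store a VPN route keyed by (RD, prefix); return True if the key is new."""
    key = (rd, prefix)
    is_new = key not in vpn_table
    vpn_table[key] = vrf_name
    return is_new

a = add_route("65000:1", "10.0.0.0/24", "vrf-a")  # first route
b = add_route("65000:2", "10.0.0.0/24", "vrf-b")  # same prefix, different RD
c = add_route("65000:1", "10.0.1.0/24", "vrf-a")  # same RD, different prefix
d = add_route("65000:1", "10.0.0.0/24", "vrf-c")  # same RD and same prefix
print(a, b, c, d, len(vpn_table))
```

Only the last call lands on an existing key: sharing an RD does nothing by itself, and two routes occupy the same entry only when both the RD and the prefix match.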

So you may have completely different prefixes sourced by the same VRF going to completely different VRFs on same or different PEs.

[WAJ] This situation is for extraVPN communication, and should be designed carefully to avoid address collisions.
If the address space in different VRFs needs to be considered in such a manner, putting them in one VRF may be more straightforward.

Kind regards,

Idr mailing list


Gyan Mishra

Network Solutions Architect

M 301 502-1347
13101 Columbia Pike
Silver Spring, MD

