Re: [Idr] WG Adoption call for draft-wang-idr-rd-orf-05.txt (2/4/2021 to 2/18/2021)

Susan Hares <> Mon, 15 February 2021 14:47 UTC

From: Susan Hares <>
To: 'Robert Raszuk' <>, 'Gyan Mishra' <>
Cc: 'Aijun Wang' <>, "'Jakob Heitz (jheitz)'" <>, "'idr@ietf. org'" <>
Date: Mon, 15 Feb 2021 09:47:17 -0500



Do you feel the problem has been correctly described by Robert? If not, please comment on the problem.


Cheers, Sue


From: Robert Raszuk [] 
Sent: Saturday, February 13, 2021 6:42 AM
To: Gyan Mishra
Cc: Aijun Wang; Jakob Heitz (jheitz); Susan Hares; idr@ietf. org
Subject: Re: [Idr] WG Adoption call for draft-wang-idr-rd-orf-05.txt (2/4/2021 to 2/18/2021)




>   The problem we are trying to solve is a scenario where you have an offending PE that is flooding routes and a weak PE that is overwhelmed by a flood of routes.


The problem is valid, but the proposed solution is not. Moreover, solutions to this very problem have been widely known and used for years.


The problem as a matter of fact has nothing to do with VPNs of any sort. 


If you are an ISP offering Internet transport, then unless you apply proper protection you are badly exposed: your clients, peers or upstreams could inject millions of routes and melt your network, and perhaps even the global Internet (if they use a registered block).


And yes, BGP ingress policy is used here to filter out junk before it enters any network. The same policy must be used in VPN cases too.


Indeed, years have passed, and I think I have seen only a very few cases of operators offering L3VPNs doing ingress prefix filtering ... Max prefix on ingress is used much more often. Those are the right tools to work on here. I am not saying we should not invent more ... we should.




* Customise secure BGP to work in VPN cases, as an example.


* Augment RRs to be a bit more intelligent, with a pinch of ML: if the number of routes with a given RD over time Tx is R, then the moment you receive 100*R you suspend those routes and raise a NOC alarm before spraying them everywhere.


* Do not put VPN customer routes into your data plane ... just handle next hops. Redefine RFC4364 altogether and use IP transport for it.


etc ... 
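The route reflector idea in the list above (learn a per-RD baseline R over a window Tx, then suspend and alarm at 100*R) could be sketched roughly as follows. This is only an illustration: the class name, the method names and the alarm list are hypothetical, and a plain threshold stands in for whatever "ML" would be used in practice.

```python
class RdSurgeGuard:
    """Hypothetical per-RD surge check for a route reflector (illustration only)."""

    def __init__(self, surge_factor=100):
        self.surge_factor = surge_factor  # the "100*R" multiplier from the text
        self.baseline = {}                # RD -> R, routes seen in a normal window Tx
        self.suspended = set()            # RDs whose routes are held back
        self.alarms = []                  # stand-in for a NOC alarm channel

    def learn_baseline(self, rd, routes_in_window):
        """Record R, the number of routes seen for this RD in a normal window."""
        self.baseline[rd] = routes_in_window

    def observe(self, rd, routes_in_window):
        """Return True if routes for this RD may be reflected onward."""
        r = self.baseline.get(rd)
        if r is None:
            # No history yet: treat the first window as the baseline.
            self.baseline[rd] = routes_in_window
            return True
        threshold = self.surge_factor * max(r, 1)
        if routes_in_window >= threshold:
            # Suspend before reflecting anything, and tell the NOC.
            self.suspended.add(rd)
            self.alarms.append(f"RD {rd}: {routes_in_window} routes, baseline {r}")
            return False
        return True


guard = RdSurgeGuard()
guard.learn_baseline("RDX1", 10)
print(guard.observe("RDX1", 50))    # below 100*R, reflected as usual
print(guard.observe("RDX1", 1000))  # reaches 100*R, suspended and alarmed
```

The point of the sketch is the ordering: the check runs before the routes are reflected, so the surge is held at the RR instead of being cleaned up afterwards.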


Focusing on not allowing the meltdown is the proper solution space ... But here we do nothing to prevent the fire from starting and instead focus on tools to extinguish it. Wrong approach. And specifically, this tool (RD-ORF) does way too much damage when used.










On Sat, Feb 13, 2021 at 4:16 AM Gyan Mishra <> wrote:




From Susan Hares' summary of where we are with the adoption call, let’s start with the problem this draft is trying to solve and gain consensus on it. Once we gain consensus we can get back to the RD-ORF solution.  See w


a) the problem this draft is trying to solve relating to BGP routes,

The problem we are trying to solve is a scenario where you have an offending PE that is flooding routes and a weak PE that is overwhelmed by a flood of routes. This is not a normal situation; it is an outage situation in which the weak PE is being overwhelmed by a flood of routes. Do we all agree with the problem statement?


Why or why not?


b) the need for additional mechanisms to solve the problem,

Do other methods exist that can solve the problem and if not do we need a new mechanism to solve this?

RTC, Peer maximum prefix, VPN maximum prefix

c) a clear description of the technology to solve the problem.


Do we all agree that in a normal situation we would never filter on RD, as that would partition the VPN, which is unwanted and is what Robert mentioned? This is not a normal situation but a unique one, where a weak PE is overwhelmed by a flood of routes. How best can this be solved?




On Fri, Feb 12, 2021 at 10:32 AM Aijun Wang <> wrote:

Hi, Robert:

Yes, the behavior of the device should be deterministic. There may be several factors to consider for this local behavior; we should describe it more clearly in this section later.

We have discussed the differences between RTC and RD-ORF a lot. As Haibo mentioned, they are not exclusive to each other, and can be used together in some situations. But they are different and can’t replace each other.

Aijun Wang

China Telecom

On Feb 12, 2021, at 23:04, Robert Raszuk <> wrote:

Sorry Aijun,


What you say is just handwaving. There is no room for it in any spec. 


When code is written, a PE must behave deterministically, and so must the RR or any other network element.


Statements such as "decisions of PE2 to judge" are not acceptable in protocol design.


Just imagine that each PE does what it feels like in a distributed network .... Same for BGP same for IGP etc .... 


And none of this is needed if on ingress between PE1 and HQ1 you apply a max prefix of 2 or even 100. It is also not needed if you enable RTC to send RT:TO_HUB from PE2 to RR.
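As a rough illustration of the ingress max-prefix protection mentioned above, the sketch below models a PE1-to-HQ1 session that is torn down once the configured limit is exceeded. The class and its behaviour are simplified assumptions; real implementations differ in detail (warning-only modes, restart timers and so on).

```python
class MaxPrefixSession:
    """Hypothetical model of a BGP session with an ingress max-prefix limit."""

    def __init__(self, limit):
        self.limit = limit
        self.prefixes = set()
        self.established = True

    def receive(self, prefix):
        """Accept a prefix, or tear the session down if the limit is exceeded."""
        if not self.established:
            return False                 # session is down, nothing is accepted
        if prefix in self.prefixes:
            return True                  # re-advertisement, no new state
        if len(self.prefixes) + 1 > self.limit:
            # Limit exceeded: drop the session and everything learned over it.
            self.established = False
            self.prefixes.clear()
            return False
        self.prefixes.add(prefix)
        return True


# HQ1 suddenly advertises 100000 /32 routes over a session limited to 100.
session = MaxPrefixSession(limit=100)
flood = [
    f"10.{(i >> 16) & 255}.{(i >> 8) & 255}.{i & 255}/32" for i in range(100000)
]
accepted = sum(session.receive(p) for p in flood)
print(accepted, session.established, len(session.prefixes))
```

In this model the flood never gets past PE1, so neither the RR nor PE2 sees it.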


But I understand - no matter what we say or how much time we spend explaining why this idea is a bad idea, you are still going to push this fwd. Oh well ... If I were you I would spend this time redefining L3VPN such that customer routes never need to be sent to SP core routers.





On Fri, Feb 12, 2021 at 3:47 PM Aijun Wang <> wrote:

Hi, Robert: has described such situations, which will require the additional local decisions of PE2 to judge whether to send the RD-ORF message out.

In your example, if only the HUB VRF exceeds its limit but the resources of PE2 are not exhausted, then PE2 will not send the RD-ORF message. It may just discard the excess 100000 /32 routes.

If the resources of PE2 are nearly exhausted, it must send the RD-ORF message out. Otherwise not only the Spoke VRF but also other VPNs on this device can’t be used.


Regarding the RR, the same principle applies: if the RR can cope with such flooding, it need not send RD-ORF to PE1. If the RR can’t cope, it must send out the RD-ORF message, or else not only the VPNs that import RD X1 routes will stop working, but also other VPNs that don’t import RD X1 routes.
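The local decision described in this message can be written down as a small function, which also makes visible how much is left undefined. This is only a sketch: the function name, the 0.9 exhaustion threshold and the returned action names are assumptions for illustration and are not taken from the draft.

```python
def local_reaction(vrf_routes, vrf_limit, device_usage, exhaustion_threshold=0.9):
    """Return the action a PE (or RR) would take under the logic described above.

    vrf_routes / vrf_limit: routes imported into the affected VRF and its limit
        (not applicable to an RR; pass 0 and 0 for it).
    device_usage: fraction of device-wide resources in use, from 0.0 to 1.0.
    exhaustion_threshold: hypothetical value for "nearly exhausted".
    """
    if device_usage >= exhaustion_threshold:
        # Device-wide resources nearly exhausted: other VPNs are at risk.
        return "send_rd_orf"
    if vrf_routes > vrf_limit:
        # Only this VRF is over its limit: drop the excess locally.
        return "discard_excess"
    return "accept"


print(local_reaction(vrf_routes=100002, vrf_limit=1000, device_usage=0.4))
print(local_reaction(vrf_routes=100002, vrf_limit=1000, device_usage=0.95))
```

Every threshold in this sketch is a local choice, so two devices seeing the same flood may react differently unless the values are specified.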


The RD-ORF mechanism just keeps the impact as small as possible.


I hope the above explanation prompts you to take a fresh look at this draft.


We would also like to invite you to join us in making the RD-ORF mechanism more robust and able to meet the critical challenges.

Aijun Wang

China Telecom

On Feb 12, 2021, at 19:30, Robert Raszuk <> wrote:

Aijun & Gyan,


Let me try one more (hopefully last) time to explain to both of you - and for that matter to anyone who supported this adoption.


Let's consider a very typical Hub and Spoke scenario as illustrated below:





HQ1 is advertising two routes:


- one default with RDX1 with RT TO_SPOKE 

- one or more specifics with RDX1 to the other HUBs


Now imagine HQ1 bought a new BGP "Optimizer" and suddenly starts advertising 100000 /32 routes just to the other HUB with RT: TO_HUB.






So PE2 detects this, as the VRF with RDX2 on it got overwhelmed during import with RT TO_HUB, and starts pushing RDX1 (the original RD) to the RR to stop getting those routes.


Well, all great, except now you are throwing the baby out with the bathwater: all spokes attached to PE2, which just import the default route to HUB HQ1, can also no longer reach their hub site, as their default route will be removed. Therefore they will have nothing to import with RT:TO_SPOKE.


Further, if the RR "independently" decided ... oh, let's push this ORF to PE1, then all of the spokes attached to a perhaps even much more powerful PE3 can also no longer reach their headquarters.
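The side effect described above can be reproduced with a toy model. The sketch below is an illustration only: the route tuple, the helper names and the RT-scoped filter used for comparison are hypothetical, and the flood is reduced to 1000 routes to keep it small.

```python
from typing import NamedTuple


class VpnRoute(NamedTuple):
    rd: str
    rt: str
    prefix: str


def advertised_by_hq1(flood_size):
    """HQ1's routes: one default for the spokes plus a flood of /32s for the hubs."""
    routes = [VpnRoute("RDX1", "TO_SPOKE", "0.0.0.0/0")]
    routes += [
        VpnRoute("RDX1", "TO_HUB", f"10.0.{i // 256}.{i % 256}/32")
        for i in range(flood_size)
    ]
    return routes


def filter_by_rd(routes, blocked_rd):
    """RD-based filtering: drops every route carrying the RD."""
    return [r for r in routes if r.rd != blocked_rd]


def filter_by_rt(routes, blocked_rt):
    """RT-scoped filtering, for comparison: drops only routes with that RT."""
    return [r for r in routes if r.rt != blocked_rt]


def vrf_import(routes, import_rt):
    """Prefixes a VRF with the given import RT would install."""
    return [r.prefix for r in routes if r.rt == import_rt]


routes = advertised_by_hq1(1000)

# Filtering on RDX1 stops the flood, but the spokes lose their default route.
print(vrf_import(filter_by_rd(routes, "RDX1"), "TO_SPOKE"))

# Filtering scoped to RT TO_HUB stops the flood and keeps the default route.
print(vrf_import(filter_by_rt(routes, "TO_HUB"), "TO_SPOKE"))
```

Both the default route and the flood share RDX1, so an RD-based filter cannot tell them apart, while the RT still can.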


- - - 




The above clearly illustrates why the proposed solution to use RD for filtering is in fact harmful. 


See, when you design new protocol extensions, the difficulty is to not break any existing protocols and deployments.


Hope this puts this long thread to rest now. 








Gyan Mishra

Network Solutions Architect 

M 301 502-1347
13101 Columbia Pike 
Silver Spring, MD