Re: [Idr] Adoption and IPR call for draft-wang-idr-vpn-prefix-orf-03.txt (8/16 to 8/30)

Jeffrey Haas <jhaas@pfrc.org> Thu, 01 September 2022 11:46 UTC

From: Jeffrey Haas <jhaas@pfrc.org>
In-Reply-To: <CAOj+MMEqy=phzLzONcWvWUx_JUd=B9cnXS2mhGxFszdoYDVyrA@mail.gmail.com>
Date: Thu, 01 Sep 2022 07:46:31 -0400
Cc: Sue Hares <shares@ndzh.com>, idr <idr@ietf.org>
Message-Id: <6C7D1574-6E03-4B40-BCD4-01109687E936@pfrc.org>
References: <tencent_3C3279A3B4DAF8DA03F446E7AAE799D8AA09@qq.com> <CAEfhRrz5aAJmy2Ye1gqss2d72nm78n4SfeowO-FU7i4Z6Zpb+A@mail.gmail.com> <0CD78D4C-672F-41AA-8E1B-98CD8A875D21@pfrc.org> <CAEfhRrxkuYMmfcdX=M9PG2mN+D5fCBF5bVxd1bSA2O9PU5G-gA@mail.gmail.com> <000001d8bbba$ceb9e4b0$6c2dae10$@tsinghua.org.cn> <CAEfhRrwrKJ4A=QQBWRXtLKi-U0udv+zPuWoW0wqbeMQ2U-=JXA@mail.gmail.com> <CAOj+MMGLQ6enLxy36ZcFHh6qaCh7Ba1QFDa5XokccT7wvvU_fg@mail.gmail.com> <010101d8bc1c$da2391e0$8e6ab5a0$@tsinghua.org.cn> <CAOj+MMGuuzLWwMbfuMd-Lu4hZiY_9QUroE9k8fiFZ_uT65aHnw@mail.gmail.com> <CABNhwV1KYddV7htnp_ijPLTV11+4iot1+LET-3ey9FXf7zBNrg@mail.gmail.com> <BY3PR05MB80812C92380A7C25A46FA97AC7799@BY3PR05MB8081.namprd05.prod.outlook.com> <5364E604-6320-40BF-8E37-7D2497980EAC@pfrc.org> <CAOj+MMEqy=phzLzONcWvWUx_JUd=B9cnXS2mhGxFszdoYDVyrA@mail.gmail.com>
To: Robert Raszuk <robert@raszuk.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/VTwOrrkVKjNfqQryzK8u4vo6wkQ>
Subject: Re: [Idr] Adoption and IPR call for draft-wang-idr-vpn-prefix-orf-03.txt (8/16 to 8/30)

Robert,

I don't support such text.

My point was that determining whether the overflow condition had been triggered can't be appropriately done at the route reflector.  My point was NOT that filtering couldn't be done at the reflector.

Your text effectively sums to "don't do ORF".  Or even don't do rt-constrain.

No.

I again refer you to the text from RFC 5291:
   Currently, it is not uncommon for a BGP speaker [BGP-4] to receive,
   and then filter out some unwanted routes from its peers based on its
   local routing policy.  Since the generation and transmission of
   routing updates by the sender, as well as the processing of routing
   updates by the receiver consume resources, it may be beneficial if
   the generation of such unwanted routing updates can be avoided in the
   first place.
Anything that a receiver might implement local policy to attempt to filter out is an appropriate candidate for an ORF.

The contortions you're going through to attempt to go from "I don't like this" to "this isn't reasonable to implement" are leading you to rather bad conclusions.  The feature is trivially implementable as long as it stays within the realm of "match this".  That said, ORFs were never guaranteed to be *efficiently* implementable because some policy operations can't be optimized a priori.  

With regard to negative operational impact, there already exist vendor features that deal with dropping excess routes in a VRF and thus have all of the "woe is me" problems you are wringing your hands over:
https://supportportal.juniper.net/s/article/How-to-individually-limit-IPv4-and-IPv6-routes-in-a-VPN-scenario?language=en_US

https://www.cisco.com/c/en/us/td/docs/ios/mtr/command/reference/mtr_book/mtr_01.html - see "maximum routes"

https://infocenter.nokia.com/public/7750SR140R4/index.jsp?topic=%2Fcom.sr.l3%2Fhtml%2Fvprn.html&anchor=i8701838

It is already operational practice to generate an alarm when the soft and hard limits have been exceeded.  See documentation for each of the above.
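
As a rough illustration of that soft/hard limit behavior, here is a minimal Python sketch. It is not any vendor's implementation: the `VrfRouteLimit` class, its field names, and the alarm strings are all hypothetical, and real implementations differ in when exactly the warning fires.

```python
from dataclasses import dataclass, field


@dataclass
class VrfRouteLimit:
    """Hypothetical per-VRF route cap with a warning (soft) threshold."""
    hard_limit: int
    soft_limit: int
    routes: set = field(default_factory=set)
    alarms: list = field(default_factory=list)

    def install(self, prefix: str) -> bool:
        """Try to install a prefix; return False if it was dropped."""
        if prefix in self.routes:
            return True  # already installed, does not count twice
        if len(self.routes) >= self.hard_limit:
            # Hard limit: the excess route is dropped and an alarm is raised.
            self.alarms.append(
                f"hard limit {self.hard_limit} reached, dropped {prefix}"
            )
            return False
        self.routes.add(prefix)
        if len(self.routes) == self.soft_limit:
            # Soft limit: the route is kept, the operator is only warned.
            self.alarms.append(f"soft limit {self.soft_limit} reached")
        return True


vrf = VrfRouteLimit(hard_limit=3, soft_limit=2)
results = [
    vrf.install(p)
    for p in ["10.0.0.0/24", "10.0.1.0/24", "10.0.2.0/24", "10.0.3.0/24"]
]
```

Routes dropped at the hard limit are simply gone from the VRF, which is why recovery after the upstream fix needs the routes to be sent again.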

When such limits are crossed, operators have to seek out the sources that are overflowing their VRFs and attempt to adjust the mixes that are contributing toward that device's VRF overflow.  Once adjusted upstream, the device has to recover missing state that may have been displaced by overflow using... wait a minute... route refresh.

So, what's changing in the proposed draft vs. what vendors have been shipping for ages?

Primarily detection mechanisms and targeted mitigation.  For their scenario, the overflow isn't necessarily fully general (though it might be); it may be because a given upstream contributor is at fault.  Semantically, it's an attempt at upstream inbound prefix-limit for a given CE connection - but it's far away.  Having determined that it's that upstream, the choice in this draft is to locally treat it as if that upstream prefix-limit has been exceeded and do the equivalent of shutting down that session.
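
A minimal Python sketch of that idea, under stated assumptions: routes are modeled as `(rd, prefix)` pairs, and `vrf_limit`, `per_rd_quota`, `pick_offending_rd`, and `apply_rd_filter` are hypothetical names for illustration. This is not the draft's actual procedure or ORF encoding, only the accounting logic described above.

```python
from collections import Counter


def pick_offending_rd(routes, vrf_limit, per_rd_quota):
    """Return the RD to filter, or None if no single RD is at fault.

    routes: iterable of (rd, prefix) pairs imported into one VRF.
    """
    unique = set(routes)
    if len(unique) <= vrf_limit:
        return None  # the VRF is not overflowing
    per_rd = Counter(rd for rd, _ in unique)
    over = {rd: n for rd, n in per_rd.items() if n > per_rd_quota}
    if not over:
        return None  # overflow is general, no targeted mitigation applies
    # Target the RD contributing the most routes above its quota.
    return max(over, key=over.get)


def apply_rd_filter(routes, rd):
    """Drop every route from the given RD, like shutting down that session."""
    return [(r, p) for r, p in routes if r != rd]


routes = [("65000:1", f"10.1.{i}.0/24") for i in range(5)]
routes += [("65000:2", f"10.2.{i}.0/24") for i in range(2)]

offender = pick_offending_rd(routes, vrf_limit=6, per_rd_quota=3)
remaining = apply_rd_filter(routes, offender)
```

Note that the filter removes all routes from the offending RD, not only the excess ones, which mirrors what a tripped prefix-limit does to a session.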

The operational consequences of the proposal in the VRF are NO DIFFERENT than an upstream CE connection prefix limit dropping a session.

Could they mitigate this locally at the PE using policy?  Certainly.  EVERY SUCH POLICY IS A REASONABLE CANDIDATE FOR AN ORF.  As I noted in other responses, signaling the ORF won't be a panacea and still implies local filter installation until the ORF has effect, or deals with the fact that a given peer BGP Speaker doesn't support the feature.

As Igor emphasizes, does mitigating a specific RD deal with the overflow accounting problems in all circumstances?  No.  The dual-homed site is a good example where such accounting becomes problematic.  In such circumstances additional filters would need to be locally installed and signaled.  Discussing such impacts scenario-wise in the draft is reasonable.

Do I personally think I'd want to deploy this as a mitigation strategy?  For many of my customers, it wouldn't be reasonable or helpful.  But that's based upon operational choices for how you provision your VPN customers.  Very similar choices to how you assign route distinguishers in the first place, or how you deal with multihoming for such customers.

My own preference, likely shared by several operators in this thread, would be to provision inbound prefix-limits at the CEs to limit the number of routes a given PE-CE session could contribute to the VPN as a whole.  Even that becomes tricky if provisioning involves more than one CE connection into the VRF; operationally the VRF as the contributor may be the desired boundary for the limit and that becomes a per-RD limit.  Such per VRF limits are enforceable via features such as those shown above.

Such strategies work great if the network is under single administrative control.  In the absence of such single administrative control, local mitigation becomes something that requires consideration.

-- Jeff


> On Sep 1, 2022, at 4:33 AM, Robert Raszuk <robert@raszuk.net> wrote:
> 
> Dear Sue & Jeff,
> 
> I completely agree with you on the point below. 
> 
> Furthermore I would like to suggest the following text to be added to the shepherd's review: 
> 
> "The proposed functionality creates a set of filters after receiving and parsing BGP UPDATE messages. The document suggests that pushing such a list of filters to the upstream IBGP peer is a helpful and sound operation. 
> 
> However, in practice the BGP UPDATEs used to construct the filter have already been received and parsed, best path has been run, and even the import action (or its simulation) has been executed. Therefore such excessive routes can be dropped on the impacted PE locally without any need for upstream signalling. BGP does not resend the full table periodically, so the same NLRIs (with the same or possibly different paths) may arrive at the affected PEs again only upon a session reset or a route-refresh-triggered dump. If the paths are different then it is likely that previously sent filters in the form of <RD, RT list, NH> will not be effective. 
> 
> Locally dropping VPN routes due to non-intersecting RTs is a very low cost operation and has been an integral part of BGP VPN deployments in all Provider Edge nodes since day one. Orders of magnitude more routes are being dropped on the receiving PEs than those which will be subject to the action described as the hypothetical problem.  Such a drop does not require running local best path or import and can be highly optimised by the local implementation. 
> 
> While dropping such excessive routes an alarm MUST be triggered to the operator to take action. 
> 
> Another advantage of local drops is that such a solution does not impact existing VPN connectivity, while the subject document does. Imagine the situation presented by one of the WG members where an existing VPN with 1000 routes works correctly on the receiving PE. For simplicity assume that those 1000 routes come from a single hub src PE's VRF with one RT. 
> 
> So what the draft proposes when receiving even 10 excessive routes is to construct the filter <RD, RT, NH of src PE> and send it to the Route Reflector. The effect of such an action would be the withdrawal of all 1010 routes! That's a harmful and not helpful model. Instead, the receiving PE can locally drop those 10 routes and notify the NOC."
> 
> Kind regards,
> Robert Raszuk
> 
> 
> 
> 
> On Thu, Sep 1, 2022 at 2:44 AM Jeffrey Haas <jhaas@pfrc.org> wrote:
> 
> 
>> On Aug 30, 2022, at 10:12 AM, John E Drake <jdrake=40juniper.net@dmarc.ietf.org> wrote:
>> I think Robert’s approach is a much better solution to the problem and it obviates the need for the subject draft.
> 
> The approach of running the procedures on the route reflector is problematic for the original scenario in multiple respects:
> - The point of measurement for the quota is the receiving VRF.
> - The reflector would have to proxy all VRF receive policies to see if the route should be applied vs. the quota.
> 
> Never mind that many providers prefer to avoid any specific provisioning of VPN behaviors on route reflectors.
> 
> -- Jeff
> 
>