Re: [Idr] Adoption and IPR call for draft-wang-idr-vpn-prefix-orf-03.txt (8/16 to 8/30)

Jeffrey Haas <jhaas@pfrc.org> Thu, 01 September 2022 01:03 UTC

From: Jeffrey Haas <jhaas@pfrc.org>
In-Reply-To: <CAEfhRrxcfqr-WvW4ujtXhh8ToMjEBAtTqKMgULNUtdS7Xi3FfQ@mail.gmail.com>
Date: Wed, 31 Aug 2022 21:03:33 -0400
Cc: Gyan Mishra <hayabusagsm@gmail.com>, idr <idr@ietf.org>, Sue Hares <shares@ndzh.com>
Content-Transfer-Encoding: quoted-printable
Message-Id: <D80520F0-A964-45DC-ACCD-050B0514516E@pfrc.org>
References: <tencent_3C3279A3B4DAF8DA03F446E7AAE799D8AA09@qq.com> <CAEfhRrz5aAJmy2Ye1gqss2d72nm78n4SfeowO-FU7i4Z6Zpb+A@mail.gmail.com> <0CD78D4C-672F-41AA-8E1B-98CD8A875D21@pfrc.org> <CAEfhRrxkuYMmfcdX=M9PG2mN+D5fCBF5bVxd1bSA2O9PU5G-gA@mail.gmail.com> <000001d8bbba$ceb9e4b0$6c2dae10$@tsinghua.org.cn> <CAEfhRrwrKJ4A=QQBWRXtLKi-U0udv+zPuWoW0wqbeMQ2U-=JXA@mail.gmail.com> <CABNhwV3=-rXCEsM1NJXt=ktQwAryBayZGjGbSqASEZ1ywomb8w@mail.gmail.com> <CAEfhRrxcfqr-WvW4ujtXhh8ToMjEBAtTqKMgULNUtdS7Xi3FfQ@mail.gmail.com>
To: Igor Malyushkin <gmalyushkin@gmail.com>
X-Mailer: Apple Mail (2.3696.120.41.1.1)
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/SQXD1H9nTVO1ix5EBRoJnXB8WVs>
Subject: Re: [Idr] Adoption and IPR call for draft-wang-idr-vpn-prefix-orf-03.txt (8/16 to 8/30)

Igor,

I think the response below covers the followup question you had to me as well.  If it hasn't, please let me know and I'll respond to that one separately.

> On Aug 29, 2022, at 2:37 PM, Igor Malyushkin <gmalyushkin@gmail.com> wrote:
> A destination PE starts receiving VPN routes from the first (a faster) PE via a session to the RR, these routes exhaust a quota and a VRF prefix limit. The destination PE sends an ORF message to the RR and starts discarding excessive routes that it already received, but it is still receiving new routes from the RR (RR hasn't received and processed the ORF message). At this time the RR starts sending also VPN routes from the second multihoming PE. It also eventually receives the ORF message and stops sending routes from the first PE. RR starts sending withdrawals for the routes of the first PE and continues sending routes of the second PE. Let's imagine, that the destination PE considers the routes of the second multihoming PE and always compares them with the quota (I'm still not sure about it, the draft is uncertain here). Due to the VRF prefix limit being passed a long time ago, the PE sends the second ORF message (although we could stop all this nightmare with the first message if it were source-less). All this time the destination PE is dropping the same amount of routes but from the second multihoming PE. The RR received the second ORF, stops sending updates, and starts sending withdrawals.
> Consider that some routes would be deleted from the VRF (I'm still not sure about it) when the destination PE sends the first ORF message. In this case, we also need to update FIB, delete the routes from the first multihoming PE, then install routes for the same destinations from the second. After the second ORF message, we again delete these routes.

You're covering several points here:
1. When an ORF has been sent to an upstream peer, it may not take immediate effect.  This is a correct observation.  As a matter of fact, once the ORF takes effect, the upstream peer will start sending BGP withdrawals for routes it has already sent, which causes some amount of CPU and message impact.  This was commented upon in the first round of adoption discussion.

2. What do you do about the counters while you're waiting on cleanup for the first ORF to be processed?  I think there are a few additional points here:
2a. While there's no requirement for the ORF sender to also have local policy that covers the ORF's behavior, it's a reasonable thing to have.  (And covers the original ORF use cases where local policy is configured, but requested to be enforced remotely.)  This means that while waiting for BGP messages from the ORF receiver to try to delete the routes at the sender, the sender is also likely re-running its policy on its rib-in and locally deleting things.  That said, this is probably not instantaneous.
2b. Cleanup is not instantaneous, but it is understood that cleanup will eventually happen, so the implementation could attempt to judge its quota with the realization that some routes that aren't yet gone will be gone.  However, if the mechanism is invoked to try to limit memory impact rather than CPU impact, the cleanup may not happen fast enough, and that may trigger an additional cycle of cleanup.

2b is a churn behavior that ideally could use further discussion in the vpn-orf Internet-Draft.
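
To make 2b a bit more concrete, here is a rough Python sketch of quota accounting that discounts routes already covered by an ORF that has been sent but not yet acted upon.  This is only an illustration under stated assumptions: the VrfQuota class, its method names, and the idea that an ORF is keyed on the source PE are all hypothetical and are not taken from the draft or from any implementation.

```python
from dataclasses import dataclass, field


@dataclass
class VrfQuota:
    """Hypothetical per-VRF route accounting (illustration only)."""
    limit: int
    # (source_pe, prefix) pairs currently held for this VRF.
    installed: set = field(default_factory=set)
    # Source PEs for which an ORF has already been sent upstream.
    filtered_sources: set = field(default_factory=set)

    def raw_count(self) -> int:
        # What actually occupies memory right now.
        return len(self.installed)

    def pending_removal(self) -> int:
        # Routes still held that an already-sent ORF should eventually withdraw.
        return sum(1 for src, _ in self.installed if src in self.filtered_sources)

    def effective_count(self) -> int:
        # Quota judged as if the pending cleanup had already happened.
        return self.raw_count() - self.pending_removal()

    def receive(self, source_pe: str, prefix: str) -> bool:
        """Process one route; return True if a new ORF should be sent."""
        if source_pe in self.filtered_sources:
            # Late arrival while the ORF is in flight: discard locally.
            return False
        self.installed.add((source_pe, prefix))
        if self.effective_count() > self.limit:
            self.filtered_sources.add(source_pe)
            return True
        return False

    def withdraw(self, source_pe: str, prefix: str) -> None:
        # Withdrawal from the upstream peer, or local policy cleanup.
        self.installed.discard((source_pe, prefix))


quota = VrfQuota(limit=3)
first = [quota.receive("PE1", f"10.0.{i}.0/24") for i in range(4)]
second = [quota.receive("PE2", f"10.0.{i}.0/24") for i in range(3)]
print(first, second, quota.raw_count(), quota.effective_count())
```

In this sketch the routes from the second PE do not trigger a second ORF, because the routes from the first PE are treated as already gone.  The raw count still exceeds the limit until the withdrawals arrive, which is exactly the memory concern above: an implementation that cares about memory rather than CPU may not be able to discount pending routes this way.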

It's also important to note that deployments may not fully deploy the ORF, so some cleanup procedures would still need to be local.

-- Jeff