Re: [Idr] Adoption and IPR call for draft-wang-idr-vpn-prefix-orf-03.txt (8/16 to 8/30)

Igor Malyushkin <gmalyushkin@gmail.com> Mon, 29 August 2022 20:23 UTC

References: <tencent_3C3279A3B4DAF8DA03F446E7AAE799D8AA09@qq.com> <CAEfhRrz5aAJmy2Ye1gqss2d72nm78n4SfeowO-FU7i4Z6Zpb+A@mail.gmail.com> <0CD78D4C-672F-41AA-8E1B-98CD8A875D21@pfrc.org> <CAEfhRrxkuYMmfcdX=M9PG2mN+D5fCBF5bVxd1bSA2O9PU5G-gA@mail.gmail.com> <000001d8bbba$ceb9e4b0$6c2dae10$@tsinghua.org.cn> <CAEfhRrwrKJ4A=QQBWRXtLKi-U0udv+zPuWoW0wqbeMQ2U-=JXA@mail.gmail.com> <CABNhwV3=-rXCEsM1NJXt=ktQwAryBayZGjGbSqASEZ1ywomb8w@mail.gmail.com> <CAEfhRrxcfqr-WvW4ujtXhh8ToMjEBAtTqKMgULNUtdS7Xi3FfQ@mail.gmail.com> <CABNhwV02myvd_NBC6J9JFNuAURwhJ3o=JfE4=5G_az5N=WVJHQ@mail.gmail.com>
In-Reply-To: <CABNhwV02myvd_NBC6J9JFNuAURwhJ3o=JfE4=5G_az5N=WVJHQ@mail.gmail.com>
From: Igor Malyushkin <gmalyushkin@gmail.com>
Date: Mon, 29 Aug 2022 22:22:45 +0200
Message-ID: <CAEfhRrx1e-jZ=jNoBnNiMY7+4sV8DkRW7eORH_3ZWt+5ax4T4A@mail.gmail.com>
To: Gyan Mishra <hayabusagsm@gmail.com>
Cc: Aijun Wang <wangaijun@tsinghua.org.cn>, Sue Hares <shares@ndzh.com>, idr <idr@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/6cuMhrDXtzoqw1o4Ce6nlq5FG_k>
Subject: Re: [Idr] Adoption and IPR call for draft-wang-idr-vpn-prefix-orf-03.txt (8/16 to 8/30)

Hi Gyan,

Please see my comments inline.

On Mon, 29 Aug 2022 at 21:46, Gyan Mishra <hayabusagsm@gmail.com> wrote:

> Hi Igor
>
> On Mon, Aug 29, 2022 at 2:38 PM Igor Malyushkin <gmalyushkin@gmail.com>
> wrote:
>
>> Hi Gyan,
>>
>> I'm not talking about the case where two CEs are connected to the same PE
>> and advertise the same prefix. Instead, imagine a scenario where a CE
>> or a group of CEs (say, a sub-ring) is connected to two PEs. These PEs use
>> different RDs for the VRF where the CE (or the sub-ring) resides. For some
>> reason, PE-CE session limits aren't configured and both PEs receive a lot
>> of routes from the CE. One PE may receive these routes slightly later
>> (imagine there is some propagation delay of routes through the sub-ring).
>> Or one of the multihomed PEs is slightly short of CPU resources. Or
>> MRAI is misconfigured, and so on. Eventually, one of the
>> PEs sends its VPN routes over internal sessions toward an RR with some delay.
>>
>
>> A destination PE starts receiving VPN routes from the first (faster) PE
>> via a session to the RR; these routes exhaust the quota and the VRF prefix
>> limit. The destination PE sends an ORF message to the RR and starts
>> discarding the excessive routes it has already received, but it is still
>> receiving new routes from the RR (the RR hasn't yet received and processed
>> the ORF message).
>>
>
>     Gyan> Clarification.  The destination PE sends the ORF message to the RR,
> the RR sends an updated Adj-RIB-Out based on the ORF entries, and the
> destination PE now receives the “filtered” routing update based on the ORF
> entries processed by the RR.
>
[IM] Time passes between these steps, even though the whole sequence
happens within a very short period. For example, when the
destination PE decides to send the ORF message, it is still receiving routes
from the RR. These routes are already excessive, and the VRF prefix limit
keeps them from being installed into the VRF locally. Also, "the
filtered routing update" is not a single entity. It is a large set of routes
that has to be packed into and sent as a number of UPDATE messages. The
destination PE has to process them too. It cannot be described simply as
"send, receive, now". It is a continuous process.

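To make this timing window concrete, here is a minimal Python sketch. It is
only a toy model under assumptions that are not in the draft: the RR sends one
fixed-size batch per tick, the destination PE sends the ORF at the moment the
VRF prefix limit is reached, and the RR applies the ORF a fixed number of
ticks later. The function name and all parameters are hypothetical.

```python
def simulate_orf_window(total_routes, batch_size, vrf_limit, orf_delay_ticks):
    """Toy model: the RR sends one batch per tick; the PE sends an ORF when the
    VRF prefix limit is reached; the RR applies it orf_delay_ticks later."""
    installed = 0          # routes installed into the VRF
    dropped_locally = 0    # routes received but kept out by the VRF prefix limit
    sent = 0               # routes the RR has sent so far
    orf_sent_at = None     # tick at which the PE sent the ORF
    tick = 0
    while sent < total_routes:
        # The RR stops only after it has received and processed the ORF.
        if orf_sent_at is not None and tick >= orf_sent_at + orf_delay_ticks:
            break
        batch = min(batch_size, total_routes - sent)
        sent += batch
        accepted = min(batch, vrf_limit - installed)
        installed += accepted
        dropped_locally += batch - accepted
        if installed >= vrf_limit and orf_sent_at is None:
            orf_sent_at = tick
        tick += 1
    return installed, dropped_locally, sent


# 1000 routes offered, 100 per tick, VRF limit 300, ORF takes effect 3 ticks later.
print(simulate_orf_window(1000, 100, 300, 3))  # (300, 200, 500)
```

In this example the PE still has to receive and locally drop 200 routes after
it has decided to send the ORF. The numbers are arbitrary; the point is only
that the window is never zero.
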
>
> The RR is doing the discarding or dropping/filtering towards the PE, and
> it’s the PE that, as a result, recovers from the high CPU and memory
> exhaustion with relief on the VRF RIB.
>
[IM] I haven't said anything about high CPU or memory usage on the
destination PE. Crossing the VRF limit does not mean that there is a
problem with resources. One PE may be able to process the excessive
routes while another may not. This can change from one
moment to the next and depends on variable parameters.

>
> I do not understand this part in quotes:
> “ but it is still receiving new routes from the RR (RR hasn't received and
> processed the ORF message).”
>
> If the new routes are from the second multihomed PE, those routes would also
> be dropped, with the RT TLV set for each source PE flooding the routes.
>
[IM] I explained this above. First, at this moment in time, the
routes from the second multihoming PE are not involved yet. Second, there is a
time window between the destination PE's reaction to the excessive
routes and the RR's decision to stop sending UPDATEs because of the ORF
message. During all of this time the RR can send routes to the destination PE
if it has them to send. I don't actually see anything bad here. It's just
how things work.

>
> The destination PE is only discarding routes based on the updated
> Adj-RIB-out from the RR
>
[IM] Yes. By discarding, I mean not installing them into the VRF due to the
VRF prefix limit. Maybe the term is not suitable, but I hope it's clear now.

>
>> At this time the RR also starts sending VPN routes from the second
>> multihoming PE. It also eventually receives the ORF message and stops
>> sending routes from the first PE.
>>
>
>    Please read RFC 5291, which explains the ORF process.  Basically, the ORF
> AFI/SAFI is first negotiated P2P, then Peer A (PE) sends the ORF towards
> Peer B (RR), and Peer B installs the ORF entries from Peer A and updates
> its Adj-RIB-Out towards Peer A, at which point the routes matching the ORF
> entries received have been dropped or excluded from the update to Peer A
> based on the 3-tuple {RT, RD, Source PE}.  So the action of dropping is
> done by the side receiving the ORF, which in this case is the RR.
>
[IM] Thanks for the reference. Actually, here you express my concern that the
RR will delete ALL routes, not only the excessive ones. Put another way, it
needs something more than just the tuple {RT, RD, SRC}. But anyway, your
statement does not change anything. The RR will reevaluate the Adj-RIB-Out
toward the destination PE and will drop or exclude (using your terminology)
routes based on {RT, RD, SRC}, but please note that routes from the second
multihoming PE have a different RD. Thus, the RR will keep sending them until
it receives the second ORF message. After that, the RR will again reevaluate
the Adj-RIB-Out and will drop or exclude routes, but now with {RT,
RD-2, SRC-2}. The destination PE will again process a number of UPDATEs
with withdrawals.
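
As a rough illustration of why the second ORF message is needed, here is a
minimal Python sketch of an Adj-RIB-Out filtered by {RT, RD, SRC} entries. It
is my own simplification, not the encoding from the draft or RFC 5291: the
`VpnRoute` type, the `adj_rib_out` function, and the sample values are all
hypothetical.

```python
from typing import NamedTuple


class VpnRoute(NamedTuple):
    rt: str      # route target
    rd: str      # route distinguisher
    src: str     # source PE
    prefix: str  # destination prefix


def adj_rib_out(routes, orf_entries):
    """Return the routes the RR would still advertise to the destination PE,
    excluding anything that matches an installed (RT, RD, SRC) ORF entry."""
    return [r for r in routes if (r.rt, r.rd, r.src) not in orf_entries]


# The same destination learned via two multihoming PEs that use different RDs.
routes = [
    VpnRoute("RT-1", "RD-1", "PE1", "10.0.0.0/24"),
    VpnRoute("RT-1", "RD-2", "PE2", "10.0.0.0/24"),
]

first_orf = {("RT-1", "RD-1", "PE1")}
second_orf = first_orf | {("RT-1", "RD-2", "PE2")}

after_first = adj_rib_out(routes, first_orf)    # PE2's route is still sent
after_second = adj_rib_out(routes, second_orf)  # nothing is left to send
```

After the first entry is installed, the route with RD-2 from PE2 still passes
the filter, so the RR keeps advertising it until the second entry arrives.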

From the point of view of destinations (not routes), this process will be
repeated twice for every destination. For example, route A from the
first multihoming PE is above the quota but below the VRF
prefix limit. Thus, it will be installed into the VRF. Then the RR will
withdraw this route as excessive. The destination PE will delete this route
from the VRF and will install a route for the same destination from the
second multihoming PE (if the destination PE has received this route, which
can actually happen). This process will be repeated when the RR receives the
second ORF message.
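
The churn for a single destination can be sketched as follows. This is a
minimal Python illustration under my own assumptions: the destination PE has
already received the route from both multihoming PEs before the first ORF
takes effect, and each ORF removes every route from its source PE. The
`fib_events` function and its arguments are hypothetical.

```python
def fib_events(prefix, arrival_order, orf_order):
    """List the VRF/FIB operations for one destination.

    arrival_order: source PEs in the order their route for prefix arrives.
    orf_order: source PEs in the order their ORF entries take effect on the RR.
    """
    events = []
    available = []   # source PEs whose route for prefix is currently known
    active = None    # source PE whose route is currently installed
    for src in arrival_order:
        available.append(src)
        if active is None:
            active = src
            events.append(("install", prefix, src))
    for src in orf_order:
        available.remove(src)  # the RR withdraws this source's route
        if active == src:
            events.append(("delete", prefix, src))
            active = available[0] if available else None
            if active is not None:
                events.append(("install", prefix, active))
    return events


for event in fib_events("A", ["PE1", "PE2"], ["PE1", "PE2"]):
    print(event)
```

With two source-specific ORF messages this gives install, delete, install,
delete for the same destination, where the second install and delete are
wasted work.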

>> RR starts sending withdrawals for the routes of the first PE and continues
>> sending routes of the second PE. Let's imagine that the destination PE
>> considers the routes of the second multihoming PE and always compares them
>> with the quota (I'm still not sure about this; the draft is unclear here).
>> Because the VRF prefix limit was passed a long time ago, the PE sends the
>> second ORF message (although we could stop this whole nightmare with the
>> first message if it were source-less). All this time the destination PE is
>> dropping the same amount of routes, but from the second multihoming PE. The
>> RR receives the second ORF, stops sending updates, and starts sending
>> withdrawals.
>> Consider that some routes would be deleted from the VRF (I'm still not
>> sure about this) when the destination PE sends the first ORF message. In
>> this case, we also need to update the FIB: delete the routes from the first
>> multihoming PE, then install routes for the same destinations from the
>> second. After the second ORF message, we delete these routes again.
>>
>>
>> On Mon, 29 Aug 2022 at 20:09, Gyan Mishra <hayabusagsm@gmail.com> wrote:
>>
>>>
>>> Hi Igor
>>>
>>> In the dual-homed CE scenario, the paths advertised from CE1 would have
>>> path ID 1 and the prefixes from CE2 would have a different path ID homed to
>>> the same PE, and if add-paths is enabled on all PEs for diverse pathing, the
>>> redundant path may also have a different path ID, as the withdrawal is done
>>> based on the path ID.  So I don’t see the withdrawal causing any kind of
>>> race conditions.
>>>
>>> Kind Regards
>>>
>>> Gyan
>>> On Mon, Aug 29, 2022 at 12:09 PM Igor Malyushkin <gmalyushkin@gmail.com>
>>> wrote:
>>>
>>>> Hi Aijun,
>>>>
>>>> We can see the solution to the problem differently, but I think any
>>>> solution must not create additional problems.
>>>>
>>>> I'm not sure that, with its possible race conditions, this solution
>>>> doesn't pose new problems for the processing of updates.
>>>>
>>>>
>>>> On Mon, 29 Aug 2022 at 17:20, Aijun Wang <wangaijun@tsinghua.org.cn> wrote:
>>>>
>>>>> Hi, Igor:
>>>>>
>>>>>
>>>>>
>>>>> The quota value shouldn't be changed dynamically.
>>>>>
>>>> [IM] OK, that was bad wording. I meant counting received routes against
>>>> a quota even if the VRF prefix limit is reached.
>>>>
>>>>>
>>>>>
>>>>> In the scenario you mentioned (CE dual-homed to two PEs), normally the
>>>>> routes from the first PE and the second PE will pass their quotas at the
>>>>> same time first.
>>>>>
>>>> [IM] What do you mean by "normally"? We *expect* that they will be
>>>> received by a destination PE almost at the same time, but it is not
>>>> guaranteed.
>>>>
>>>>> Then when the VRF limit is reached, both of them will be withdrawn via
>>>>> the VPN Prefixes ORF message at the same time.
>>>>>
>>>> [IM] This statement is based on a previous invalid assumption.
>>>>
>>>>>
>>>>>
>>>>> So is the scenario you mentioned rare or impossible?
>>>>>
>>>> [IM] I don't think that multihoming of a CE is rare, and I don't think
>>>> that multihoming PEs will send updates at the same time and at the same
>>>> pace (there are lots of reasons for that).
>>>>
>>>>> Aijun Wang
>>>>>
>>>>> China Telecom
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> *From:* idr-bounces@ietf.org [mailto:idr-bounces@ietf.org] *On Behalf Of* Igor
>>>>> Malyushkin
>>>>> *Sent:* 29 August 2022 21:27
>>>>> *To:* Jeffrey Haas <jhaas@pfrc.org>
>>>>> *Cc:* idr <idr@ietf.org>; Sue Hares <shares@ndzh.com>
>>>>> *Subject:* Re: [Idr] Adoption and IPR call for
>>>>> draft-wang-idr-vpn-prefix-orf-03.txt (8/16 to 8/30)
>>>>>
>>>>>
>>>>>
>>>>> Hi Jeff,
>>>>>
>>>>> Thanks for the comments.
>>>>>
>>>>>
>>>>>
>>>>> I'm concerned that the suggested solution covers only a subset of cases.
>>>>> For example, if a multihomed CE sends us lots of prefixes (that for some
>>>>> unknown reason we didn't drop at ingress), one multihoming PE can distribute
>>>>> them slightly faster than the other. In that case, routes from one
>>>>> multihoming PE will deplete both its quota and the VRF prefix limit. At the
>>>>> same time, routes from the second multihoming PE arrive. Let's imagine that
>>>>> the RR hasn't yet withdrawn all excessive routes of the first multihoming
>>>>> PE; it is still in the process. Here we need to drop locally (due to the
>>>>> good old prefix limit) roughly the same amount of routes from the second
>>>>> leg and also receive and process withdrawals from the RR for the first leg.
>>>>> I believe we will make things with resources even worse. Not to mention
>>>>> that if we free some room for prefixes due to ORF, we will be doomed to
>>>>> update the RIB/FIB twice in vain.
>>>>>
>>>>>
>>>>>
>>>>> Maybe it's a good move to count a quota independently of the VRF limit
>>>>> (such a mechanism isn't described in the draft, so I'm not sure how
>>>>> it actually works). In the scenario above, even though we locally drop
>>>>> excessive routes from the second multihoming PE due to the VRF prefix
>>>>> limit, we can also reduce its quota at the same time and react much faster.
>>>>>
>>>>>
>>>>>
>>>>> Please also see my comments inline.
>>>>>
>>>>>
>>>>>
>>>>> On Mon, 29 Aug 2022 at 14:45, Jeffrey Haas <jhaas@pfrc.org> wrote:
>>>>>
>>>>> Igor,
>>>>>
>>>>> > On Aug 29, 2022, at 8:39 AM, Igor Malyushkin <gmalyushkin@gmail.com>
>>>>> wrote:
>>>>> >
>>>>> >
>>>>> > In the first option, the RR will withdraw PE3's routes until the
>>>>> number of these routes reaches PE3's quota, right? In that case, the
>>>>> described problem can happen only in the second scenario, because there
>>>>> will be room for the routes of PE2. If the RR withdraws only the routes
>>>>> that overflowed the VRF prefix limit, the described problem applies in any
>>>>> case.
>>>>>
>>>>> One observation is that the local systems, when examining their
>>>>> quotas, can use the fact that they know whether or not a given RD is
>>>>> intended to be mitigated by the ORF.
>>>>>
>>>>> Exactly how the system needs to behave in the implementation would
>>>>> partially depend on the reason for mitigation.  For memory exhaustion, it
>>>>> may need to be more aggressive about discarding routes.  For CPU overload,
>>>>> lesser mitigations may be sufficient.
>>>>>
>>>>> [IM] Actually, exceeding a VRF prefix limit (which triggers the sending
>>>>> of an ORF message) does not mean that there are any problems with
>>>>> memory or CPU. It is just a threshold; a device can even locally drop all
>>>>> excessive routes without starving its resources. This threshold
>>>>> (the VRF limit) is the only good and reliable trigger for us. We also can't
>>>>> know beforehand which problem will arise when routes overflow; it may
>>>>> be a memory problem, a CPU problem, or even both. So I can't see a
>>>>> way to configure "the aggressiveness mode" for the proposed solution
>>>>> either. Or I didn't get your point.
>>>>>
>>>>>
>>>>> I think the critical implementation detail is that once this ORF is
>>>>> triggered, it should require operator intervention to clear to avoid
>>>>> thrashing routes.
>>>>>
>>>>> [IM] Operator intervention should be triggered earlier, when the
>>>>> quota has been passed. But I agree that the number of excessive routes can
>>>>> be so large that it runs through the quota and the VRF limit almost
>>>>> simultaneously.
>>>>>
>>>>>
>>>>> -- Jeff
>>>>>
>>> --
>>>
>>> <http://www.verizon.com/>
>>>
>>> *Gyan Mishra*
>>>
>>> *Network Solutions A**rchitect *
>>>
>>> *Email gyan.s.mishra@verizon.com <gyan.s.mishra@verizon.com>*
>>>
>>>
>>>
>>> *M 301 502-1347*
>>>