Re: [Idr] WG Adoption call for draft-wang-idr-rd-orf-05.txt (2/4/2021 to 2/18/2021)

Gyan Mishra <hayabusagsm@gmail.com> Tue, 16 February 2021 21:01 UTC

From: Gyan Mishra <hayabusagsm@gmail.com>
Date: Tue, 16 Feb 2021 15:59:20 -0500
Message-ID: <CABNhwV3eff+h==Bzi1tRkiZDFri5TQWnQUFvzHbY9eLmMfoUag@mail.gmail.com>
To: Robert Raszuk <robert@raszuk.net>
Cc: Aijun Wang <wangaijun@tsinghua.org.cn>, "Jakob Heitz (jheitz)" <jheitz@cisco.com>, Susan Hares <shares@ndzh.com>, "idr@ietf. org" <idr@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/c56feleNj5vdBwPtzQ-R-I5o1s0>

Hi Robert

Responses in-line, from an operator’s perspective.

Thanks

Gyan

On Sat, Feb 13, 2021 at 6:42 AM Robert Raszuk <robert@raszuk.net> wrote:

> All,
>
> >   The problem we are trying to solve is a scenario where you have an
> offending PE that is flooding routes and a weak PE that is overwhelmed by a
> flood of routes.
>

   Gyan> Correct

>
> The problem is a valid problem. But the proposed solution to the problem
> is not. And moreover solutions to this very problem are widely known and
> used for years.
>
> *The problem as a matter of fact has nothing to do with VPNs of any sort.*
>
>
> If you are ISP offering Internet transport unless you apply proper
> protection you would be badly exposed. Your clients or peers or upstreams
> injecting millions of routes and melting your network and perhaps even
> global Internet (if they would use a registered block).
>

    Gyan>  The weak-PE issue has many variables and permutations.  Because
there are always upgrade cycles, some devices will be on older code or
hardware.  Another variable is whether the PE carries multiple services,
such as a mix of L2 and L3: for example, L3 VPN alongside L2 VPLS, or an
EVPN Type 2 MAC route flood.  Any other services running in parallel on
the PE could exacerbate the issue.

>
> And yes BGP ingress policy here is used to filter out junk before it
> enters any network. The same policy must be used in VPN cases too.
>
> Indeed years have passed and I think I have only seen in a very few cases
> that operators offering L3VPNs are doing prefix ingress filtering ... Max
> prefix on ingress is used much more often. Those are the right tools here
> to work on. I am not saying we should not invent more ... we should.
>

   Gyan> As far as existing mitigation techniques go, the problem is that
although they are proactive rather than reactive to this particular flood
issue, they don’t solve the problem.  From an operator standpoint, of the
knobs you mentioned that are used today, ingress filtering is not used
other than as a generic bitwise prefix list that permits any prefix with
“ge”/“le”.  Both PE-CE maximum prefix and VPN maximum prefix are set to a
very high watermark, relying on statistical multiplexing.  That protects
against a flood on a single VRF or a few VRFs, but not against a
simultaneous flood on all VRFs, as resources would be exhausted even on a
strong PE with newer hardware.
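
To put rough numbers on the statistical-multiplexing point, here is a minimal sketch. Every figure in it (PE capacity, VRF count, per-VRF limit, steady-state size) is a made-up assumption for illustration and is not taken from any real platform or deployment.

```python
# Hypothetical numbers for illustration only; not from any real platform.
PE_ROUTE_CAPACITY = 1_000_000   # assumed total VPN routes the PE can hold
VRF_COUNT = 200                 # assumed number of VRFs on the PE
PER_VRF_MAX_PREFIX = 50_000     # assumed high-watermark limit per VRF
NORMAL_ROUTES_PER_VRF = 2_000   # assumed steady-state routes per VRF


def total_routes(flooding_vrfs: int) -> int:
    """Routes held when `flooding_vrfs` VRFs sit at their max-prefix limit
    and the remaining VRFs stay at their normal size."""
    quiet_vrfs = VRF_COUNT - flooding_vrfs
    return (flooding_vrfs * PER_VRF_MAX_PREFIX
            + quiet_vrfs * NORMAL_ROUTES_PER_VRF)


def survives(flooding_vrfs: int) -> bool:
    """True if the PE stays within its assumed capacity."""
    return total_routes(flooding_vrfs) <= PE_ROUTE_CAPACITY
```

With these assumed values, a flood confined to one VRF or a handful of VRFs stays within capacity, while a flood that drives every VRF to its limit does not, even though no individual max-prefix limit is ever exceeded.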

>
> Ideas:
>
> * Customise secure BGP to work in VPN cases as example.
>
> * Augment RRs to be a bit more intelligent with pitch of ML - if number of
> routes with given RD for time Tx is R moment you receive 100*R you suspend
> those and raise NOC alarm before spraying everywhere
>

Gyan> This sounds similar to the BGP route dampening concept.  It would
maybe require a new SAFI or codepoint.  Also, the concept of suspend does
not exist in BGP, so it would require major updates to the BGP protocol
itself, which to me is a riskier and much more intrusive change.  The nice
thing about RD-ORF is that the machinery already exists with ORF and BGP
route refresh.  Also, the ORF Adj-RIB-Out filter can be manually removed
without impact from flaps or any instability in the network.
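
For reference, the RR-side idea quoted above (a baseline of R routes per RD over time Tx, then suspend and alarm at 100*R) might look roughly like the sketch below. The class name, method names, and the "suspend"/"reflect" return values are all hypothetical; BGP has no suspend operation today, so this only illustrates the counting logic, not a protocol mechanism.

```python
from collections import defaultdict


class RdFloodDetector:
    """Hypothetical per-RD flood check for a route reflector.

    baseline[rd] is R, the route count seen for that RD in the previous
    window Tx. If the current window reaches factor * R, the RD is flagged.
    """

    def __init__(self, factor: int = 100):
        self.factor = factor
        self.baseline = {}                 # rd -> R from the last window
        self.current = defaultdict(int)    # rd -> count in this window
        self.suspended = set()             # RDs flagged for the NOC

    def on_route(self, rd: str) -> str:
        """Count one received route; return 'suspend' or 'reflect'."""
        self.current[rd] += 1
        r = self.baseline.get(rd)
        if r and self.current[rd] >= self.factor * r:
            self.suspended.add(rd)
        return "suspend" if rd in self.suspended else "reflect"

    def end_window(self) -> None:
        """Close window Tx: current counts become the new baseline
        for RDs that are not suspended."""
        for rd, count in self.current.items():
            if rd not in self.suspended:
                self.baseline[rd] = count
        self.current.clear()
```

Note that an RD with no baseline yet is never flagged in this sketch, and nothing here says how a suspended RD would be released; both would need defined behavior in a real design.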

>
> * Do not put VPN customer routes into your data plane ... just handle next
> hops. Redefine RFC4364 all together and use IP transport for it.
>

   Gyan> The major reason RD-ORF is acceptable from an operator’s
perspective, as far as risk goes, is that it is not a MAJOR rewrite of or
change to the existing BGP protocol.  The other ideas mentioned above are
MAJOR code hacks that can be very risky: there is a chance of bugs from
collateral code damage, where fixing one thing breaks something else.
SAFI 128 does not have separation of control plane and data plane as do
SAFI 129 MVPN or EVPN.  We have seen that many times in the past, with new
features breaking existing features or the protocol itself.  The beauty of
RD-ORF is that it uses the existing ORF spec RFC 5291 and route refresh
RFC 7313, so only a very minor change is required.

>
> etc ...
>
> Focus on not allowing the meltdown is the proper solution space ... But
> here instead we do nothing to prevent the fire to start and instead focus
> on tools to extinguish it. Wrong approach. And specifically this tool
> (RD-ORF) does way too much damage when used.
>

    Gyan> The ideas you propose above are major code hacks to the BGP
protocol to prevent the fire proactively.  Hence the attention in this
draft on extinguishing the fire.  I think the extinguish concept, a
reactive trigger to a problem, is the least risky way to solve the
problem, as it uses existing BGP machinery and needs only a very minor
code change to implement.  As for a holistic approach to this issue and
any other route-flood type issues, I am all for looking at a possible
solution in the future, but I think that should be an independent draft
project and should not hold up this solution.

>
> Thx,
> R.
>
>
>
>
>
>
> On Sat, Feb 13, 2021 at 4:16 AM Gyan Mishra <hayabusagsm@gmail.com> wrote:
>
>>
>> All
>>
>> From Susan Hares summary of where we are at with the adoption call let’s
>> start with the problem this draft is trying to solve and gaining
>> consensus.  Once we gain consensus we can get back to RD-ORF solution.  See
>> w
>>
>> a) the problem this draft is drafting to solve relating to BGP routes,
>>
>> The problem we are trying to solve is a scenario where you have an
>> offending PE that is flooding routes and a weak PE that is overwhelmed by a
>> flood of routes.  This is not a normal situation; it is an outage situation
>> where the weak PE is being overwhelmed by a flood of routes.  Do we all
>> agree on the problem statement?
>>
>>
>> Why and why not?
>>
>>
>> b) the need for additional mechanisms to solve the problem,
>>
>> Do other methods exist that can solve the problem and if not do we need a
>> new mechanism to solve this?
>>
>> RTC, Peer maximum prefix, VPN maximum prefix
>>
>> c) a clear description of the technology to solve the problem.
>>
>>
>> Do we all agree that in a normal situation we would never filter on RD, as
>> that would partition the VPN, which is unwanted and is what Robert mentioned?
>> This is not a normal situation but a unique one, where a weak PE is
>> overwhelmed by a flood of routes.  How best can this be solved?
>>
>>
>>
>> On Fri, Feb 12, 2021 at 10:32 AM Aijun Wang <wangaijun@tsinghua.org.cn>
>> wrote:
>>
>>> Hi, Robert:
>>> Yes, the behavior of the device should be determined. There may be
>>> several factors to consider for this local behavior; we should
>>> describe it more clearly in this section later.
>>> We have discussed the differences between RTC and RD-ORF a lot. As Haibo
>>> mentioned, they are not exclusive to each other, and can be used together
>>> in some situations. But they are different and can’t replace each other.
>>>
>>> Aijun Wang
>>> China Telecom
>>>
>>> On Feb 12, 2021, at 23:04, Robert Raszuk <robert@raszuk.net> wrote:
>>>
>>> 
>>>
>>> Sorry Aijun,
>>>
>>> What you say is just handwaving. There is no room for it in any spec.
>>>
>>> When code is written, the PE must behave deterministically, and so must
>>> the RR or any other network element.
>>>
>>> Statements "decisions of PE2 to judge" are not acceptable in protocol
>>> design.
>>>
>>> Just imagine that each PE does what it feels like in a distributed
>>> network .... Same for BGP same for IGP etc ....
>>>
>>> And all of this is not needed if on ingress between PE1 and HQ1 you
>>> apply max prefix of 2 or even 100. It is also not needed if you enable  RTC
>>> to send RT:TO_HUB from PE2 to RR.
>>>
>>> But I understand - no matter what we say or how much we spend time to
>>> explain why this idea is a bad idea you are still going to push this fwd.
>>> Oh well ...   If I were you I would spend this time to redefine L3VPN such
>>> that customer routes are never needed to be sent to SP core routers.
>>>
>>> Thx,
>>> R.
>>>
>>>
>>> On Fri, Feb 12, 2021 at 3:47 PM Aijun Wang <wangaijun@tsinghua.org.cn>
>>> wrote:
>>>
>>>> Hi, Robert:
>>>>
>>>>
>>>> https://datatracker.ietf.org/doc/html/draft-wang-idr-rd-orf-05#section-5.1.1 has
>>>> described such situations, which will require the additional local
>>>> decisions of PE2 to judge whether to send the RD-ORF message out.
>>>> In your example, if only the HUB VRF is exceeded but the resources of
>>>> PE2 are not exhausted, then PE2 will not send the RD-ORF message. It may
>>>> just discard the excess 100000 /32 routes.
>>>> If the resources of PE2 are nearly exhausted, it must send the RD-ORF
>>>> message out. Otherwise, not only the Spoke VRF but also other VPNs on this
>>>> device can’t be used.
>>>>
>>>> Regarding the RR, the same principle applies: if the RR can cope with such
>>>> flooding, it need not send out RD-ORF to PE1. If the RR can’t cope, it
>>>> must send out the RD-ORF message, or else not only the VPNs that import RD
>>>> X1 routes can’t work, but also other VPNs that don’t import RD X1 routes.
>>>>
>>>> The RD-ORF mechanism just keeps the impact as small as possible.
>>>>
>>>> I hope the above explanation prompts a fresh review of this draft.
>>>>
>>>> We would also like to invite you to join us in making the RD-ORF
>>>> mechanism more robust so that it meets the critical challenges.
>>>>
>>>> Aijun Wang
>>>> China Telecom
>>>>
>>>> On Feb 12, 2021, at 19:30, Robert Raszuk <robert@raszuk.net> wrote:
>>>>
>>>> 
>>>> Aijun & Gyan,
>>>>
>>>> Let me try one more (hopefully last) time to explain to both of you,
>>>> and for that matter to anyone who supported this adoption.
>>>>
>>>> Let's consider very typical Hub and Spoke scenario as illustrated
>>>> below:
>>>>
>>>> <image.png>
>>>>
>>>>
>>>> HQ1 is advertising two routes:
>>>>
>>>> - one default with RDX1 with RT TO_SPOKE
>>>> - one or more specifics with RDX1 to the other HUBs
>>>>
>>>> Now imagine HQ1 bought a new BGP "Optimizer" and suddenly is starting
>>>> to advertise 100000 /32 routes just to the other HUB with RT: TO_HUB.
>>>>
>>>> <image.png>
>>>>
>>>>
>>>>
>>>> So PE2 detects this as VRF with RDX2 on it got overwhelmed during
>>>> import with RT TO_HUB and starts pushing RDX1 (original RD) to RR to stop
>>>> getting those routes.
>>>>
>>>> Well, all great, except now you are throwing the baby out with the
>>>> bathwater: all spokes attached to PE2, which just import the default route
>>>> to HUB HQ1, also can no longer reach their hub site, as their default route
>>>> will be removed. Therefore they will have nothing to import with RT:TO_SPOKE.
>>>>
>>>> Further if RR "independently" decided ... oh let's push this ORF to PE1
>>>> then all of the spokes attached to perhaps even much more powerful PE3 can
>>>> also no longer reach their headquarters.
>>>>
>>>> - - -
>>>>
>>>> *Summary: *
>>>>
>>>> The above clearly illustrates why the proposed solution to use RD for
>>>> filtering is in fact harmful.
>>>>
>>>> See when you design new protocol extensions the difficulty is to not
>>>> break any existing protocols and deployments.
>>>>
>>>> Hope this puts this long thread to rest now.
>>>>
>>>>
>>>> Thx,
>>>> Robert
>>>>
>>>> --
>>
>> <http://www.verizon.com/>
>>
>> *Gyan Mishra*
>>
>> *Network Solutions Architect*
>>
>>
>>
>> *M 301 502-1347*
>> *13101 Columbia Pike*, Silver Spring, MD
>>
>> --

<http://www.verizon.com/>

*Gyan Mishra*

*Network Solutions Architect*



*M 301 502-1347*
*13101 Columbia Pike*, Silver Spring, MD