[bess] Fwd: VXLAN BGP EVPN Question

Gyan Mishra <hayabusagsm@gmail.com> Fri, 23 April 2021 22:21 UTC

References: <CABNhwV1kPDhRcuqS8GOpTiKoyk_QHKVXa582JUznLvKfXQe54w@mail.gmail.com> <CABNhwV352jKVSu2Jwf9dRgzmjc05gvmLo_CL5EGuR12en-7Z_A@mail.gmail.com> <CABNhwV0Tww-8SxBocQBZ4vuj7DMW44ymN4ux4h1JoaYNENnioQ@mail.gmail.com> <8193067d-3b28-40fc-8c96-3f3c528ece6c@Spark> <CABNhwV227BzgCy4JURSqYE_OZ4EgtETSKJ0jZ3yrHEHLuSqVuQ@mail.gmail.com> <0B536455-62B0-434A-9FD8-D5DBB51ACA9E@nokia.com>
In-Reply-To: <0B536455-62B0-434A-9FD8-D5DBB51ACA9E@nokia.com>
From: Gyan Mishra <hayabusagsm@gmail.com>
Date: Fri, 23 Apr 2021 18:21:14 -0400
Message-ID: <CABNhwV2W6O3=9bNHzpQR76XZvwcQZo46hkFfSGcfs3g+3FP2fw@mail.gmail.com>
To: "Ali Sajassi (sajassi)" <sajassi@cisco.com>, BESS <bess@ietf.org>, Jeff Tantsura <jefftant.ietf@gmail.com>, John E Drake <jdrake@juniper.net>, "Rabadan, Jorge (Nokia - US/Mountain View)" <jorge.rabadan@nokia.com>
Content-Type: multipart/alternative; boundary="0000000000006fb25e05c0ab37d1"
Archived-At: <https://mailarchive.ietf.org/arch/msg/bess/FS9gyAiKNbcb-9dDVbxWd8oVDws>
Subject: [bess] Fwd: VXLAN BGP EVPN Question

Authors

Do we know if this draft will progress to RFC?

https://tools.ietf.org/html/draft-ietf-bess-dci-evpn-overlay-10


This is a very useful draft for intra-DC, multi-pod NVO3 solutions with
multiple vendors.


Thanks

Gyan


---------- Forwarded message ---------
From: Rabadan, Jorge (Nokia - US/Mountain View) <jorge.rabadan@nokia.com>
Date: Fri, Apr 24, 2020 at 3:07 AM
Subject: Re: [bess] VXLAN BGP EVPN Question
To: Gyan Mishra <hayabusagsm@gmail.com>, Jeff Tantsura <
jefftant.ietf@gmail.com>
CC: BESS <bess@ietf.org>


Hi Gyan,



If I may, note that:

https://tools.ietf.org/html/draft-ietf-bess-dci-evpn-overlay-10#section-4.6



It also provides VXLAN segmentation, and while the description is based on
DCI, you can perfectly well use it for inter-pod connectivity.



Thanks.

Jorge



*From: *BESS <bess-bounces@ietf.org> on behalf of Gyan Mishra <
hayabusagsm@gmail.com>
*Date: *Friday, April 24, 2020 at 5:21 AM
*To: *Jeff Tantsura <jefftant.ietf@gmail.com>
*Cc: *BESS <bess@ietf.org>
*Subject: *Re: [bess] VXLAN BGP EVPN Question





Hi Jeff



Yes - Cisco has a multi-site draft covering inter-pod and inter-site use
cases: a segmented path between disparate POD fabrics intra-DC, or as a DCI
option inter-DC without MPLS.  The segmentation localizes BUM traffic and
uses border gateway DF election for the BUM traffic that is stitched
between PODs, similar to inter-AS L3 VPN option B as I mentioned.  As you
said, there is an extra load on the border gateway (BGW), which performs the
VTEP decap of traffic from the leaf and then encaps again towards the
egress border gateway.  Due to that extra load it is not recommended to run
the spine function on the BGW, so multi-site needs an extra layer to be
scalable.  It definitely requires a proprietary ASIC and is not a merchant
silicon or white box solution.  As you stated, the BUM traffic is much
reduced compared with multiple fabrics connected by a super spine, or a
single fabric whose spine contains all leafs.

That decoupling sounds like an incongruent control and data plane with
MAC-only Type 2 routes, which would result in more BUM traffic, but it
sounds like that may be the trade-off of conversation learning of only
active flows versus the entire data-center-wide MAC-VRF being learned
everywhere.  I wonder if there is an option for a real decoupling of the
EVPN control plane and the VXLAN data plane overlay that does not impact
convergence but adds stability, with Type 2 MAC routes learned across the
fabric only for active flows.
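
To make the decap/re-encap step at the border gateway concrete, here is a
minimal Python sketch. It only handles the 8-byte VXLAN header from RFC 7348
(flags byte with the "I" bit, 24-bit VNI); the outer IP/UDP headers are left
out. The function names and the optional VNI translation map are
assumptions for illustration and are not taken from the multi-site draft or
from any vendor implementation.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag from RFC 7348


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for a 24-bit VNI."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 0: flags in the top byte, then 24 reserved bits.
    # Word 1: VNI in the top 24 bits, then 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vni(header: bytes) -> int:
    """Return the VNI carried in an 8-byte VXLAN header."""
    word0, word1 = struct.unpack("!II", header[:8])
    if not (word0 >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag not set")
    return word1 >> 8


def bgw_restitch(packet: bytes, vni_map: dict[int, int]) -> bytes:
    """Hypothetical BGW step: decap, optionally translate the VNI, encap again.

    `packet` is the VXLAN header followed by the inner Ethernet frame.
    VNIs missing from `vni_map` are carried over unchanged.
    """
    vni = unpack_vni(packet)
    inner_frame = packet[8:]
    return pack_vxlan_header(vni_map.get(vni, vni)) + inner_frame
```

Every frame that crosses the pod boundary goes through this parse and
rebuild once per gateway, which is the extra load discussed above.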



https://datatracker.ietf.org/doc/draft-sharma-multi-site-evpn/



Kind regards



Gyan



On Thu, Apr 23, 2020 at 6:04 PM Jeff Tantsura <jefftant.ietf@gmail.com>
wrote:

Gyan,



“Multi site” is not really IETF terminology; it is a solution implemented
by NX-OS, though there is a draft. Its main functionality is to localize
VXLAN tunnels and provide a segmented path instead of an end-to-end full
mesh of VXLAN tunnels (participating in the same EVI). We are talking HER
(head-end replication) here.

The feature is heavily HW dependent as it requires BUM re-encapsulation at
the boundaries (leaf1->BGW1-BGW2->leaf2..n). So good luck seeing it soon on
low-end silicon.

It doesn’t eliminate BUM traffic but significantly reduces the span of the
“broadcast domain” and reduces the need for large flood domains (modern HW
gives you ~512 large flood groups, obviously depending on HW).
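
The effect of segmentation on head-end replication can be estimated with a
few lines of Python. This is a rough sketch under stated assumptions: every
leaf participates in the same EVI, every pod has the same number of leafs,
and one BGW per pod forwards BUM (for example the elected DF). The function
names are made up for illustration.

```python
def her_fanout_full_mesh(pods: int, leafs_per_pod: int) -> int:
    """Copies an ingress leaf sends per BUM frame with an end-to-end mesh."""
    return pods * leafs_per_pod - 1


def her_fanout_segmented(pods: int, leafs_per_pod: int) -> dict[str, int]:
    """Copies sent at each stage when BUM is re-encapsulated at the borders.

    Assumes a single forwarding BGW per pod.
    """
    return {
        "ingress_leaf": (leafs_per_pod - 1) + 1,  # local leafs + local BGW
        "ingress_bgw": pods - 1,                  # one copy per remote pod
        "egress_bgw": leafs_per_pod,              # leafs in the remote pod
    }
```

With 4 pods of 32 leafs, the full mesh needs 127 copies at the ingress leaf,
while the segmented path needs 32 at the leaf, 3 at the ingress BGW and 32
at each egress BGW.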



Wrt your question about MAC conversation learning - this is an
implementation issue; nothing in the EVPN specifications precludes you from
doing so. Moreover, in the implementation I was designing (in my previous
life) we indeed decoupled data plane learning from control plane
advertisement so the control plane was aware of “Active” flows.  Needless to
say - this creates an additional layer of complexity and all kinds of funky
states in the system ;-).
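
One possible shape for that decoupling, as a minimal Python sketch: MACs are
learned locally in the data plane, but a Type 2 route is originated only
once a flow for the MAC is seen, and it is withdrawn after an idle timeout.
The class, its method names and the timeout value are hypothetical and do
not describe the implementation mentioned above or any shipping product.

```python
from dataclasses import dataclass, field


@dataclass
class ConversationalMacTable:
    """Hypothetical table: learn every local MAC, advertise only active ones."""

    idle_timeout: float = 300.0
    learned: dict[str, float] = field(default_factory=dict)  # mac -> last flow time
    advertised: set[str] = field(default_factory=set)

    def learn_local(self, mac: str) -> None:
        # Data-plane learning alone does not trigger an advertisement.
        self.learned.setdefault(mac, float("-inf"))

    def flow_seen(self, mac: str, now: float) -> bool:
        """Record fabric traffic for a MAC; return True if a route must be sent."""
        if mac not in self.learned:
            return False
        self.learned[mac] = now
        if mac in self.advertised:
            return False
        self.advertised.add(mac)
        return True

    def expire(self, now: float) -> set[str]:
        """Return MACs whose routes should be withdrawn because they went idle."""
        idle = {m for m in self.advertised
                if now - self.learned[m] > self.idle_timeout}
        self.advertised -= idle
        return idle
```

The extra state is visible even in this toy version: a MAC can be learned
but not advertised, advertised and active, or advertised and about to age
out.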



Hope this helps



Cheers,

Jeff

On Apr 23, 2020, 8:30 AM -0700, Gyan Mishra <hayabusagsm@gmail.com>, wrote:





Slight clarification on the ARP traffic.  What I meant was that broadcast
traffic is translated into BUM traffic in the EVPN architecture.  Is there
any way to reduce the amount of BUM traffic when the data center design
requires VLAN-anywhere sprawl, with 1000s of Type 2 MAC mobility routes
being learned between all the leaf VTEPs?



The elimination of broadcast is a tremendous gain, and broadcast
suppression of multicast does help, but it would be nice not to have such
massive MAC tables and Type 2 route churn, and instead have conversation
learning where only active flows are in the Type 2 RIB.



Kind regards



Gyan



On Wed, Apr 22, 2020 at 6:47 PM Gyan Mishra <hayabusagsm@gmail.com> wrote:



My description of the VXLAN BGP EVPN scenario has a typo: the multi-site
feature's segmented LSP between pods with the RT auto-rewrite is similar to
MPLS inter-AS option B, not A.



Kind regards



Gyan



On Wed, Apr 22, 2020 at 5:57 PM Gyan Mishra <hayabusagsm@gmail.com> wrote:



All



I had a question related to the VXLAN BGP EVPN architecture specifications
defined in the BGP EVPN NVO3 overlay RFC 8365 and the VXLAN data plane RFC
7348.



Consider a data center environment with multiple PODs, an individual fabric
per POD, connected via a super spine extension using a multi-site feature
that auto-rewrites RTs to stitch the NVE tunnel between pods, similar to
inter-AS option A.



In this scenario you have VLAN sprawl everywhere, with L2 and L3 VNIs
everywhere as if it were a single L2 domain.  The topology is a typical
VXLAN spine-leaf topology where the L3 leafs are the TOR switches, so the
physical L2 fault domain is very small.  So I was wondering whether, with
the VXLAN architecture, the feature below is possible, or whether there is
a way to do it in the current specification.



Cisco used to have a DC product called “FabricPath” which was based on
conversation learning.



Is there any way, with the existing VXLAN BGP EVPN specification or maybe a
future enhancement, to have a MAC conversation learning capability so that
only the active MACs that are part of a conversation flow are the MACs that
are flooded throughout the VXLAN fabric?  That would help tremendously with
ARP storms: new ARP entries generated locally on a leaf would not be
flooded through the fabric unless there are active flows between leafs.



Also, is there a way to filter Type 2 MAC mobility routes between leaf
switches at the control plane level based on remote VTEP or maybe other
parameters?  That would also reduce ARP storms and BUM traffic.
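
As an illustration of what such a control-plane filter could look like, here
is a minimal Python sketch that accepts Type 2 (MAC/IP Advertisement) routes
only when the next hop, taken here as the remote VTEP address, falls inside
an allowed prefix. The `EvpnRoute` record and `accept_route` function are
hypothetical stand-ins, not the policy language of any BGP implementation.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class EvpnRoute:
    route_type: int   # 2 = MAC/IP Advertisement (RFC 7432)
    mac: str
    next_hop: str     # remote VTEP address


def accept_route(route: EvpnRoute, allowed_vteps: list[str]) -> bool:
    """Accept non-Type-2 routes; accept Type 2 only from allowed VTEP prefixes."""
    if route.route_type != 2:
        return True
    hop = ip_address(route.next_hop)
    return any(hop in ip_network(prefix) for prefix in allowed_vteps)
```

One caveat with this kind of filter: a MAC whose route is filtered is
unknown to the receiving leaf, so traffic towards it is flooded as unknown
unicast, which can offset part of the BUM reduction.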



Kind regards



Gyan

--

Gyan  Mishra

Network Engineering & Technology

Verizon

Silver Spring, MD 20904

Phone: 301 502-1347

Email: gyan.s.mishra@verizon.com










_______________________________________________
BESS mailing list
BESS@ietf.org
https://www.ietf.org/mailman/listinfo/bess





-- 

<http://www.verizon.com/>

*Gyan Mishra*

*Network Solutions Architect*

*Email gyan.s.mishra@verizon.com <gyan.s.mishra@verizon.com>*



*M 301 502-1347*