Re: [bess] draft-mohanty-bess-evpn-bum-opt-00 - clarification on problem description

"Satya Mohanty (satyamoh)" <> Sat, 24 March 2018 14:00 UTC

From: "Satya Mohanty (satyamoh)" <>
To: Sandy Breeze <>, "Rabadan, Jorge (Nokia - US/Mountain View)" <>, Eric C Rosen <>, "" <>
Date: Sat, 24 Mar 2018 14:00:26 +0000
List-Id: BGP-Enabled ServiceS working group discussion list <>

Some additional comments inline [Satya]

From: BESS on behalf of Sandy Breeze
Date: Saturday, March 24, 2018 at 11:39 AM
To: "Rabadan, Jorge (Nokia - US/Mountain View)", Eric C Rosen
Subject: Re: [bess] draft-mohanty-bess-evpn-bum-opt-00 - clarification on problem description

John, Eric, Jorge,

[Sandy] Comments inline

On 24/03/2018, 10:57, "Rabadan, Jorge (Nokia - US/Mountain View)" wrote:

Eric, as discussed and you point out, one can easily interpret that IMET is not mandatory in some cases where multi-destination traffic is not needed. In any case, whether this document is Informational or Standards Track is probably not that important.

If this had to be done, out of the options you list, I think omitting the PTA would not be backwards compatible since the use of PTA is a MUST in RFC7432, so RRs wouldn’t like it. Maybe label zero could cause issues too. So maybe a flag is the least disruptive one if the document has to modify something.
[Sandy] I agree, and detailed reasoning on those points in line below

I still think it may be better to proceed with the IMET withdraw procedure and clarify that it is only valid for:
a) BUM traffic in IR cases
b) BDs with no igmp/mld/pim proxy
c) BDs with no OISM or IRBs
d) BDs with I-ES associated to overlay tunnels and no other ACs
[Sandy] We’re happy to call out those restrictions under a definition of what we mean in connecting EVPN domains at DCI for L2 only scenarios
[Satya] Yes, this is applicable to the restricted cases stated above.
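
As an illustration only (not taken from the draft), restrictions a) to d) above can be read as a single predicate that a PE evaluates per BD before withdrawing its IMET route. All field and function names below are hypothetical, and treating "non-DF on the I-ES" as the trigger is an assumption drawn from this thread rather than from any specification text.

```python
from dataclasses import dataclass


@dataclass
class BridgeDomain:
    # Every field name here is hypothetical, chosen only for this sketch.
    is_ndf: bool                    # assumption: PE is non-DF on the I-ES for this BD
    uses_ingress_replication: bool  # a) BUM is carried with IR
    has_igmp_mld_pim_proxy: bool    # b) must be False
    has_oism_or_irb: bool           # c) must be False
    only_ies_on_overlay: bool       # d) I-ES on overlay tunnels, no other ACs


def may_withdraw_imet(bd: BridgeDomain) -> bool:
    """Return True only when every restriction a) to d) holds and the PE is NDF."""
    return (
        bd.is_ndf
        and bd.uses_ingress_replication
        and not bd.has_igmp_mld_pim_proxy
        and not bd.has_oism_or_irb
        and bd.only_ies_on_overlay
    )


l2_only_dci = BridgeDomain(True, True, False, False, True)
bd_with_irb = BridgeDomain(True, True, False, True, True)
print(may_withdraw_imet(l2_only_dci))  # eligible: L2-only DCI gateway, NDF
print(may_withdraw_imet(bd_with_irb))  # not eligible: IRB present
```

Any further restriction or caveat added later would simply become another conjunct in the predicate.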

And any other restrictions/caveats we may need to add.

My 2 cents.

On 23/03/2018, 17:00, "Eric C Rosen" wrote:

[Jorge] my interpretation of RFC7432 is that IMET routes are mandatory to enable the handling of multi-destination traffic in a BD. But in a non-DF PE for a given ES and with no other ACs in the BD, assuming Ingress Replication, there is no such multi-destination traffic (Tx or Rx). So one could interpret that RFC7432 is ok with withdrawing the IMET route in that case.

If we consider the case of all-active multi-homing, then there may well be Tx multi-destination traffic in the scenario under discussion, as multicast traffic from a given ES could arrive at any PE attached to the ES, whether or not that PE is the DF.

[Sandy] In the specific scenario with all-active multi-homing, EVPN GW / DCI routers connect an EVPN VXLAN fabric and an EVPN MPLS core.  Therefore, all NDF PEs (read: EVPN GWs) will send no IMET either to the RRs in the EVPN MPLS core or to the ToRs in the EVPN VXLAN fabric, so neither side of the EVPN GW would attract BUM.
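
A minimal sketch of why an NDF gateway that sends no IMET attracts no BUM, assuming ingress replication where each PE builds its per-BD flood list purely from the IMET routes it has received. The PE names, the BD name, and the (PE, BD) tuple representation are made up for illustration.

```python
def flood_list(imet_routes, bd):
    """Return the remote PEs to which BUM for `bd` is ingress-replicated.

    `imet_routes` is a set of (originating_pe, bd) pairs, a deliberately
    simplified stand-in for received IMET (RT3) routes.
    """
    return sorted(pe for pe, route_bd in imet_routes if route_bd == bd)


# Hypothetical view from a ToR: two gateways and one other PE advertise IMET.
imet_routes = {("GW1", "BD10"), ("GW2", "BD10"), ("PE3", "BD10")}
before = flood_list(imet_routes, "BD10")

# GW2 is NDF and withdraws its IMET route, so it drops out of the flood list.
imet_routes.discard(("GW2", "BD10"))
after = flood_list(imet_routes, "BD10")

print(before)
print(after)
```

The same reasoning applies on the MPLS core side of the gateway, where the flood list is built from IMET routes reflected by the RRs.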

The relevant section from RFC 7432 is:

11.  Handling of Multi-destination Traffic

   Procedures are required for a given PE to send broadcast or multicast
   traffic received from a CE encapsulated in a given Ethernet tag
   (VLAN) in an EVPN instance to all the other PEs that span that
   Ethernet tag (VLAN) in that EVPN instance.  In certain scenarios, as
   described in Section 12 ("Processing of Unknown Unicast Packets"), a
   given PE may also need to flood unknown unicast traffic to other PEs.

   The PEs in a particular EVPN instance may use ingress replication,
   P2MP LSPs, or MP2MP LSPs to send unknown unicast, broadcast, or
   multicast traffic to other PEs.

   Each PE MUST advertise an "Inclusive Multicast Ethernet Tag route" to
   enable the above.  The following subsection provides the procedures
   to construct the Inclusive Multicast Ethernet Tag route.  Subsequent
   subsections describe its usage in further detail.

Interestingly, this says that the IMET route is mandatory to enable "the above", where "the above" is "send broadcast or multicast traffic received from a CE".  Note it says "send", not "receive".

[Sandy] This is a very good point, Eric.  It’s my understanding that the reason for ‘sending’ IMET is that you want to signal eligibility to receive BUM traffic.  ‘Send’ vs ‘receive’ wording aside, I agree with RFC7432 in that, as per my first comment above, those PEs would not attract BUM from the EVPN VXLAN fabric side, because they also wouldn’t be sending the IMET to the ToR if they’re NDF…

If P2MP tunnels are used for the BUM traffic, the IMET route is certainly required to support all-active multi-homing.
[Sandy] I concur

If IR is used, or if single-active multi-homing is used, one could argue that RFC 7432 didn't really need to require the IMET route.  However, it does.
[Sandy] I do not concur, as per my previous point, noting that the 7432 wording asserts that sending IMET enables a PE to forward BUM received from a CE. But in this scenario it will never receive BUM from the EVPN VXLAN fabric side, because we won’t send IMET to the ToR.

[John] Wouldn’t it be better to have this draft define a bit in the Multicast Flags extended community (<>) indicating that the originating PE is neither DF nor backup DF for this broadcast domain on any ES to which it is attached?  This allows us to always advertise the IMET route and makes the situation explicit.  I think the consensus is that this situation is rare, so the number of IMET route updates shouldn’t be excessive, and we could also say that this bit is only set by EVPN DC GWs.
[Sandy] We discussed the use of an extended community to signal NDF; this is indeed a viable alternative approach and one we’re not against.  We didn’t choose it over not sending IMET because we don’t have a good reason why not sending IMET at an NDF is actually a bad idea for our use case.  That said, if the consensus of this list is to use an extended community, then a flag in the EVPN extended community sub-types registry is a possible fit.
[Satya] Just to make clear, the Multicast Flags extended community is only sent with the IMET when IGMP proxy is supported. If the BD does not support IGMP/MLD, then this won’t work, as Jorge pointed out earlier (BUM with IR only). Perhaps, if there were flag fields available in the PMSI Tunnel attribute, one could have used them. Also, creating a new extended community for the purpose of this optimization looks to me like adding extra complexity to the control plane, so we did not prefer doing that. Setting the label to 0 would have worked, but the value 0 now has the semantics that Sandy points out below.

If it's worth doing at all, this would be a better method.  Alternatives would be omitting the PMSI Tunnel attribute, or setting the MPLS label in the PMSI Tunnel attribute to 0.
[Sandy] The label field in the PTA is used to carry the VNI in overlay signalling (section 5.1.3 in draft-ietf-bess-evpn-overlay-11).  Setting this to 0 is also described in this section of the bess-evpn-overlay draft, to mean single-VNI per MAC-VRF.  I suspect sending IMET but omitting the PTA would likely just invalidate the RT3 completely in bess-evpn-overlay, hence achieving the same result as not sending it in the first place, given that the language in section 9 is MUST send IMET to indicate what tunnel-type is to be used for multicast.  Also, interestingly, in section 9 (support for multicast) I see wording such as:
   However, for globally-assigned VNIs, each PE MUST advertise IMET
   route to other PEs in an EVPN instance for ingress replication or
   PIM-SSM tunnel, and MAY advertise IMET route for PIM-SM or Bidir-PIM
   tunnel. In case of PIM-SM or Bidir-PIM tunnel, no information in the
   IMET route is needed by the PE to setup these tunnels.
in addition, in section 4.1 of
   A PE advertises an SMET route for that (x, G) group in that
   [EVI, BD] when it has IGMP Join (x, G) state in that [EVI, BD] on at
   least one ES for which it is DF and it withdraws that SMET route when
   it does not have IGMP Join (x, G) state in that [EVI, BD] on any ES
   for which it is DF.
I’ve heard no objections to these drafts permitting the selective sending of [S|I]MET, and this seems to be similar logic to what we’re trying to apply.
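
The quoted SMET rule can be sketched as a small predicate, to make the parallel with DF-dependent IMET origination concrete. This is a hedged illustration: the function name and the dictionary and set representations are hypothetical, not taken from either draft.

```python
def should_advertise_smet(join_state_by_es, df_es):
    """Decide SMET origination for one (x, G) group in one [EVI, BD].

    join_state_by_es: dict mapping ES name -> True if IGMP Join (x, G)
                      state exists on that ES (hypothetical representation).
    df_es:            set of ES names for which this PE is DF.

    Advertise when Join state exists on at least one ES for which the PE
    is DF; otherwise the route is withdrawn (or never originated).
    """
    return any(
        has_join and es in df_es for es, has_join in join_state_by_es.items()
    )


joins = {"ES1": True, "ES2": False}
print(should_advertise_smet(joins, {"ES1"}))  # Join state on an ES where PE is DF
print(should_advertise_smet(joins, {"ES2"}))  # Join state only on a non-DF ES
```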

[Sandy] We’d considered alternative methods other than withdraw, such as an extended community or something specific in the PMSI Tunnel Attribute.  The withdraw/don’t-advertise RT3 approach was chosen for the following reasons:
·         Requires no change to protocol
Since the proposal changes the conditions under which an IMET route is originated, it is certainly changing the protocol.  (It's obvious that the finite state machine is changed.)  Perhaps what is meant is that the protocol change is backwards compatible with systems that implement only RFC7432.
[Sandy] ACK
[Satya] Yes
But it does not appear to be backwards compatible with systems that have IRB, and the draft has no analysis of the impact on all the various extensions and proposed extensions to RFC7432.
[Sandy] IRB was not our use case, and yes, we acknowledge that the -00 is missing written analysis of other drafts and standards, though I assure you they have been discussed.  The IRB multicast draft was not taken into account, however, as IRB is not our use case.
[Satya] Yes, the draft was written very late and does not have the analysis of the extensions to RFC7432.
As Sandy mentioned above, after getting your feedback over the email, we went over the following drafts again.

We have not found anything so far that is an anomaly.
However, if a conflict is found later, we will address it.

·         Is computationally easier on all participating PEs, which deal with a simple withdraw rather than looking for something in an update.  For instance, on transition from BDF to NDF
These are of course not the only considerations.

[Sandy] ACK
***** Previous Discussion with John ************

[Sandy] I don’t think that, in my specific case, I’m reliant on setting ESI to zero, since I have only 1 ES per BD. In the more general case, however, as Satya mentions in an earlier thread, it may be desirable to set ESI to 0 for optimal NDF position where all BDs share the same ES and there are many ESs.

[JD]  I have heard this assertion from both you and Satya.  Do you have any evidence to support it?
[Satya] An EVI can be present on more than two ES. That is what is meant. And those two ES can have the exact same set of EVIs.
This can be done by configuration and is a common case. Nothing specific to the draft.