Re: [bess] shepherd review of draft-ietf-bess-evpn-etree

"Ali Sajassi (sajassi)" <sajassi@cisco.com> Thu, 20 October 2016 06:15 UTC

From: "Ali Sajassi (sajassi)" <sajassi@cisco.com>
To: Thomas Morin <thomas.morin@orange.com>, "draft-ietf-bess-evpn-etree@ietf.org" <draft-ietf-bess-evpn-etree@ietf.org>, Loa Andersson <loa@pi.nu>, "George Swallow -T (swallow - MBO PARTNERS INC at Cisco)" <swallow@cisco.com>, Eric Rosen <erosen@juniper.net>, BESS <bess@ietf.org>
Thread-Topic: shepherd review of draft-ietf-bess-evpn-etree
Thread-Index: AQHSAgSQiVios9dTSUOmTfNWP8T6rqBlA4UAgAGT+oCASmTegA==
Date: Thu, 20 Oct 2016 06:15:28 +0000
Message-ID: <D42D4E86.1BE849%sajassi@cisco.com>
References: <3323ddae-c96f-49a4-2dec-1bfc4ed857dc@orange.com> <D3EA14B3.1B9CAE%sajassi@cisco.com> <6cb41698-b98b-ecbf-9e34-660771bd3fb8@orange.com>
In-Reply-To: <6cb41698-b98b-ecbf-9e34-660771bd3fb8@orange.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/bess/vLULe6wvNWE1P-IplRFnnTl0JBI>
Cc: Martin Vigoureux <martin.vigoureux@nokia.com>
Subject: Re: [bess] shepherd review of draft-ietf-bess-evpn-etree

Hi Thomas,

Thanks again for your additional comments. Please find my comment resolutions below, and let me know if there are any further comments.

Regards,
Ali

From: Thomas Morin <thomas.morin@orange.com>
Organization: Orange
Date: Friday, September 2, 2016 at 8:11 AM
To: Cisco Employee <sajassi@cisco.com>, "draft-ietf-bess-evpn-etree@ietf.org" <draft-ietf-bess-evpn-etree@ietf.org>, Loa Andersson <loa@pi.nu>, "George Swallow -X (swallow - CLEARPATH WORKFORCE MANAGEMENT INC at Cisco)" <swallow@cisco.com>, Eric Rosen <erosen@juniper.net>, BESS <bess@ietf.org>
Cc: Martin Vigoureux <martin.vigoureux@nokia.com>
Subject: Re: shepherd review of draft-ietf-bess-evpn-etree
Resent-From: <alias-bounces@ietf.org>
Resent-To: Cisco Employee <sajassi@cisco.com>, <ssalam@cisco.com>, <ju1738@att.com>, <jdrake@juniper.net>, <sboutros@vmware.com>, <jorge.rabadan@nokia.com>
Resent-Date: Friday, September 2, 2016 at 8:11 AM

Hi Ali,

Thanks for the quick respin, which covers many of the points.

(inlined below, skipping the resolved points)

2016-09-02, Ali Sajassi (sajassi):

   sites albeit for different EVIs.

                   +---------+            +---------+
                   |   PE1   |            |   PE2   |
    +---+          |  +---+  |  +------+  |  +---+  |            +---+
    |CE1+---ES1----+--+   |  |  | MPLS |  |  |   +--+----ES2-----+CE2|
    +---+  (Root)  |  |MAC|  |  |  /IP |  |  |MAC|  |   (Leaf)   +---+
                   |  |VRF|  |  |      |  |  |VRF|  |
                   |  |   |  |  |      |  |  |   |  |            +---+
                   |  |   |  |  |      |  |  |   +--+----ES3-----+CE3|
                   |  +---+  |  +------+  |  +---+  |   (Leaf)   +---+
                   +---------+            +---------+

   Figure 1: Scenario 1


   In such scenario, an EVPN PE implementation MAY provide E-TREE
   service using topology constraint among the PEs belonging to the same

"topology constraint" is a bit opaque as a term, perhaps "using tailored BGP RT import/export policies" would be more descriptive (assuming I understood your intent)

Done. Changed it to “topology constraint tailored by BGP Route Target (RT) import/export policies"

(I still think that "topology" is not a helpful term to use here.)

Removed “topology”. It now reads “using tailored Route Target (RT) import/export …."
   EVI. The purpose of this topology constraint is to avoid having PEs
   with only Leaf sites importing and processing BGP MAC routes from
   each other. To support such topology constraint in EVPN, two BGP
   Route-Targets (RTs) are used for every EVPN Instance (EVI): one RT is
   associated with the Root sites and the other is associated with the
   Leaf sites. On a per EVI basis, every PE exports the single RT
   associated with its type of site(s). Furthermore, a PE with Root
   site(s) imports both Root and Leaf RTs, whereas a PE with Leaf
   site(s) only imports the Root RT.
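
The two-RT policy described above can be sketched in Python as follows. This is only an illustration under stated assumptions: the RT values are made up, PE3 is a hypothetical second Leaf-only PE that does not appear in Figure 1, and the class and function names are not taken from the draft.

```python
from dataclasses import dataclass

# Hypothetical RT values for one EVI; a real deployment picks its own.
ROOT_RT = "65000:100"
LEAF_RT = "65000:101"

@dataclass(frozen=True)
class PE:
    name: str
    site_type: str  # "root" or "leaf" (Scenario 1: one site type per PE per EVI)

    def export_rt(self) -> str:
        # Every PE exports the single RT associated with its type of site(s).
        return ROOT_RT if self.site_type == "root" else LEAF_RT

    def import_rts(self) -> frozenset:
        # A Root PE imports both RTs; a Leaf PE imports only the Root RT.
        if self.site_type == "root":
            return frozenset({ROOT_RT, LEAF_RT})
        return frozenset({ROOT_RT})

def imports_routes_from(receiver: PE, sender: PE) -> bool:
    """True if the receiver's import policy accepts the sender's routes."""
    return sender.export_rt() in receiver.import_rts()

pe1 = PE("PE1", "root")
pe2 = PE("PE2", "leaf")
pe3 = PE("PE3", "leaf")  # hypothetical extra Leaf-only PE
```

With these definitions, PE1 and PE2 accept each other's routes, while PE2 and PE3 do not import MAC routes from each other. Note that this says nothing about two Leaf ACs on the same PE, which is the point raised in the comment below.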

The text seems to imply that the above is sufficient to deliver the service, but I fail to see what would prevent Leaf-to-Leaf traffic between Leaves bound to the same MAC-VRF (ES2 and ES3 in Figure 1). Shouldn't the text mention the use of split-horizon in Leaf MAC-VRFs?

Agree, nice catch! I changed the first sentence from:
"In such scenario, an EVPN PE implementation MAY provide E-TREE service using topology constraint among the PEs belonging to the same EVI."
TO
"In such scenario, topology constraint, provided by BGP Route Target (RT) import/export policies among the PEs belonging to the same EVI, can be used to restrict the communications among Leaf PEs."

The sentence above does not in fact address my question, which was about communication between Leaf ACs (rather than about communication between Leaf PEs).
Let me restate here, more clearly: I fail to see what would prevent Leaf-to-Leaf traffic between **ACs** bound to the same MAC-VRF (ES2 and ES3 in Figure 1). Shouldn't the text mention the use of split-horizon in Leaf MAC-VRFs?

OK. I mentioned the use of split-horizon filtering explicitly for blocking inter-Leaf communication within the same PE.

"In such a scenario, tailored BGP Route Target (RT) import/export policies among the PEs belonging to the same EVI can be used to restrict the communications among Leaf PEs. To restrict the communications among Leaf sites connected to the same PE and belonging to the same EVI, split-horizon filtering is used, i.e., the interfaces associated with Leaf sites are placed in the same split-horizon group."

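The local split-horizon behavior discussed here can be sketched in Python as below. This is a minimal illustration, not text from the draft: the function name is made up, "ES4" is a hypothetical Root AC added only to show the contrast, and the rule against sending a frame back out its arrival AC is ordinary bridging behavior that is assumed rather than quoted.

```python
def may_forward_locally(in_ac: str, out_ac: str, leaf_group: set) -> bool:
    """Decide whether a frame received on in_ac may be sent out out_ac
    on the same PE, given the set of ACs in the Leaf split-horizon group."""
    if in_ac == out_ac:
        return False  # assumed bridging rule: never reflect a frame
    # Two members of the same split-horizon group never exchange traffic.
    return not (in_ac in leaf_group and out_ac in leaf_group)

# PE2 in Figure 1: both ES2 and ES3 are Leaf ACs.
leaf_group_pe2 = {"ES2", "ES3"}
```

Under this sketch, `may_forward_locally("ES2", "ES3", leaf_group_pe2)` is `False`, while traffic between "ES2" and the hypothetical Root AC "ES4" is allowed.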


(assuming the previous point is resolved:)

With the mechanism above, isn't it possible to have on a given PE, for a single E-TREE EVI, both Leaves and Roots, as long as distinct MAC-VRFs are used (one for Leaves and one for Roots)? (It seems to me that the asymmetric import/export RTs would do what is needed to build an E-TREE; we would just have a particular case where a Leaf MAC-VRF and a Root MAC-VRF for a given E-TREE end up on a single PE.)

That’s not possible because per definition of an EVI, there is only a single MAC-VRF per EVI for a PE.

Where can I read such a definition? (The Terminology section in RFC7432 does not say that, unless I'm missing something.)
And that seems a completely arbitrary restriction.
(Just thinking that a given PE device can be split into two logical devices shows that it can work.)

Section 6 of RFC7432, which defines the different service interface types, specifies the relationship between MAC-VRF and VLAN (bridge table) and how many MAC-VRFs (and bridge tables) there can be per EVI. In the bridging world, there can only be a single bridge table per VLAN in a device.

Besides, I don’t understand what good it does to have two MAC-VRFs on the same PE (one for Leafs and another for Roots)

Well, the "what is good for" is pretty simple: it means you can have, just by tailoring the import/export policies like in 2.1, something as useful as the scenario in 2.2.

There can only be a single bridge table per VLAN. Now even if you add some kind of logic to form two logical PEs in a single physical PE, you end up replicating all the MAC addresses associated with the root sites in two bridge tables.


because Leafs and Roots need to talk to each other and thus we want them to be in the same MAC-VRF.

The fact that Leafs and Roots need to talk to each other does not mean that they *have* to be in the same MAC-VRF: you can rely on the local MPLS dataplane inside the PE to carry the traffic between Roots and Leaves, passing it between a Leaf MAC-VRF and a Root MAC-VRF (and you can possibly implement a shortcut not involving MPLS encap/decap).

Anything is possible, but at what cost? The current proposal is very efficient in terms of forwarding path as well as control plane.
However, Leafs should not talk among themselves and thus we can put all the Leaf ACs in a split-horizon group.

Yes, this is the meaning of my initial comment above and it is true independently of whether or not you consider the possibility of having both a Roots MAC-VRF and Leaf MAC-VRF on a same PE.

Yes, incorporated the split-horizon filtering comment.


If this is not possible, I think the text should explain why.

I don’t think we need an explanation, for the reason given above, but if you think otherwise, then please suggest the text you think I should add.

Two possibilities:
- if indeed there is no possibility of having, for a given E-Tree, both a Root MAC-VRF and a Leaf MAC-VRF on a given PE, then the text is only missing an explanation of why it is not possible
- else, if the possibility exists, then it means that the asymmetric RT procedures currently described in 2.1 are in fact another way of addressing the scenario supported by 2.2 ("a PE receives traffic from either Root OR Leaf sites (but not both) on a given Attachment Circuit (AC) of an EVI."), so the content of 2.1 and 2.2 would be two approaches for supporting this scenario (2.1 -> "Approach A, Root MAC-VRF + Leaf MAC-VRF, two RTs", and 2.2 -> "Approach B, Root/Leaf MAC-VRF, single RT")

As mentioned before, a VLAN can only have a single bridge table. Having two bridge tables results in duplicating some of the MAC addresses. Furthermore, the job of the standard is not to explain what is not “doing” but rather to describe clearly what it is “doing”.



2.2 Scenario 2: Leaf OR Root site(s) per AC

   In this scenario, a PE receives traffic from either Root OR Leaf
   sites (but not both) on a given Attachment Circuit (AC) of an EVI. In
   other words, an AC (ES or ES/VLAN) is either associated with a Root
   or Leaf (but not both).

s/with a Root or Leaf/with Roots or Leaves/ ?

Agree – Changed it to "Root(s) or Leaf(s)"

Re-reading and thinking a bit: "an AC is either a Root AC or a Leaf AC (but not both)" would be much much clearer ?
Done.

or "an AC is either associated as a Root or as a Leaf (but not both)" perhaps.
(but my initial suggestion wasn't great)


                     +---------+            +---------+
                     |   PE1   |            |   PE2   |
    +---+            |  +---+  |  +------+  |  +---+  |            +---+
    |CE1+-----ES1----+--+   |  |  |      |  |  |   +--+---ES2/AC1--+CE2|
    +---+    (Leaf)  |  |MAC|  |  | MPLS |  |  |MAC|  |   (Leaf)   +---+
                     |  |VRF|  |  |  /IP |  |  |VRF|  |
                     |  |   |  |  |      |  |  |   |  |            +---+
                     |  |   |  |  |      |  |  |   +--+---ES2/AC2--+CE3|
                     |  +---+  |  +------+  |  +---+  |   (Root)   +---+
                     +---------+            +---------+

   Figure 2: Scenario 2

   In this scenario, if there are PEs with only root (or leaf) sites per
   EVI, then the RT constraint procedures described in section 2.1 can
   also be used here. However, when a Root site is added to a Leaf PE,
   then that PE needs to process MAC routes from all other Leaf PEs and
   add them to its forwarding table.

This is the case in 2.1 as well, isn't it ?

It can start as 2.1 but as soon as you add Root site to a Leaf PE, then it becomes different (per last sentence of the above para).

I guess we need to first conclude the discussion about the section 2.1, before the above can be discussed efficiently.
Hope it is concluded.


For this scenario, if for a given
   EVI, the majority of PEs will eventually have both Leaf and Root
   sites attached, even though they may start as Root-only or Leaf-only
   PEs, then it is recommended to use a single RT per EVI and avoid
   additional configuration and operational overhead.

Why this recommendation ?
Even with a majority of PEs having both Leaves and Roots, there can remain (up to 49% of) PEs having only Leaves, which will uselessly have all routes to other Leaves.

So "it is recommended" above, deserves to be explained more, I think.

OK, I changed “majority” to “vast majority” :-)

My point was not to nit pick on "majority", but was that you should explain why you recommend that.
As the text currently reads, the cost of the recommendation can be identified: having useless routes on the fraction of PEs having only Leaves.
But the gain brought by the recommendation is not even mentioned, not to say explained.
Hence: why ?
(Why is it a useful tradeoff to have useless routes on some, even if only one, PE ?)

Changed the last sentence from:
"then it is recommended to use a single RT per EVI and avoid additional configuration and operational overhead.”
To
"then it is recommended to use a single RT per EVI and avoid additional configuration and operational overhead at the expense of having unwanted MAC addresses on the Leaf PEs."


is on a per MAC address. This scenario is considered in
   this draft for EVPN service with only known unicast traffic - i.e.,
   there is no BUM traffic.

"there is no BUM" is quite a bold claim ! :=

Maybe the text should say "no BUM traffic is supported (BUM traffic will be dropped)" ?

(possibly "BUM traffic from Leaves will be dropped" would be sufficient ?)

Changed it to “BUM traffic is not supported in this scenario and it is dropped”.

adding "by the ingress PE" ?

Done.


                     +---------+            +---------+
                     |   PE1   |            |   PE2   |
    +---+            |  +---+  |  +------+  |  +---+  |            +---+
    |CE1+-----ES1----+--+   |  |  |      |  |  |   +--+---ES2/AC1--+CE2|
    +---+    (Root)  |  | E |  |  | MPLS |  |  | E |  | (Leaf/Root)+---+
                     |  | V |  |  |  /IP |  |  | V |  |
                     |  | I |  |  |      |  |  | I |  |            +---+
                     |  |   |  |  |      |  |  |   +--+---ES2/AC2--+CE3|
                     |  +---+  |  +------+  |  +---+  |   (Leaf)   +---+
                     +---------+            +---------+

   Figure 3: Scenario 3

3 Operation for EVPN

   [RFC7432] defines the notion of ESI MPLS label used for split-horizon
   filtering of BUM traffic at the egress PE. Such egress filtering
   capabilities can be leveraged in provision of E-TREE services as seen
   shortly. In other words, [RFC7432] has inherent capability to support
   E-TREE services without defining any new BGP routes but by just
   defining a new BGP Extended Community for leaf indication as shown
   later in this document.

3.1 Known Unicast Traffic

   Since in EVPN, MAC learning is performed in control plane via
   advertisement of BGP routes, the filtering needed by E-TREE service
   for known unicast traffic can be performed at the ingress PE, thus
   providing very efficient filtering and avoiding sending known unicast
   traffic over MPLS/IP core to be filtered at the egress PE as done in
   traditional E-TREE solutions (e.g., E-TREE for VPLS).

   To provide such ingress filtering for known unicast traffic, a PE
   MUST indicate to other PEs what kind of sites (root or leaf) its MAC
   addresses are associated with by advertising a leaf indication flag
   (via an Extended Community) along with each of its MAC/IP
   Advertisement route. The lack of such flag indicates that the MAC
   address is associated with a root site.



  This scheme applies to all
   scenarios described in section 2.

   Furthermore, for multi-homing scenario of section 2.2, where an AC is
   either root or leaf (but not both), the PE MAY advertise leaf
   indication along with the Ethernet A-D per EVI route. This
   advertisement is used for sanity checking in control-plane to ensure
   that there is no discrepancy in configuration among different PEs of
   the same redundancy group. For example, if a leaf site is multi-homed
   to PE1 and PE2, and PE1 advertises the Ethernet A-D per EVI
   corresponding to this leaf site with the leaf-indication flag but PE2
   does not, then the receiving PE notifies the operator of such
   discrepancy and ignores the leaf-indication flag on PE1. In other
   words, in case of discrepancy, the multi-homing for that pair of PEs
   is assumed to be in default "root" mode for that <ESI, EVI> or <ESI,
   EVI/VLAN>. The leaf indication flag on Ethernet A-D per EVI route
   tells the receiving PEs that all MAC addresses associated with this
   <ESI, EVI> or <ESI, EVI/VLAN> are from a leaf site. Therefore, if a
   PE receives a leaf indication for an AC via the Ethernet A-D per EVI
   route but doesn't receive a leaf indication in the corresponding MAC
   route, then it notifies the operator and ignores the leaf indication on
   the Ethernet A-D per EVI route.
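
The control-plane sanity check described in this paragraph can be sketched in Python as follows. It is an illustration only: the function names, the notice strings, and the dictionary-based input are assumptions, not anything defined by the draft.

```python
from typing import Dict, List, Tuple

def resolve_segment_mode(ad_leaf_flags: Dict[str, bool]) -> Tuple[str, List[str]]:
    """ad_leaf_flags maps each PE of the redundancy group to whether its
    Ethernet A-D per EVI route carried the leaf indication."""
    notices: List[str] = []
    if len(set(ad_leaf_flags.values())) > 1:
        # Discrepancy: notify the operator and fall back to the default "root" mode.
        notices.append("leaf indication mismatch among multi-homing PEs")
        return "root", notices
    mode = "leaf" if ad_leaf_flags and all(ad_leaf_flags.values()) else "root"
    return mode, notices

def effective_ad_leaf(ad_says_leaf: bool, mac_route_says_leaf: bool) -> Tuple[bool, List[str]]:
    """Cross-check the A-D per EVI leaf indication against a MAC route for the same AC."""
    if ad_says_leaf and not mac_route_says_leaf:
        # Notify the operator and ignore the leaf indication on the A-D route.
        return False, ["A-D route says leaf but MAC route does not"]
    return ad_says_leaf, []
```

For example, `resolve_segment_mode({"PE1": True, "PE2": False})` yields the "root" mode plus one operator notice, matching the PE1/PE2 example in the text.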


The procedure above should I think be rephrased to provide unambiguous interpretation in the case where a given MAC is being announced in more than one MAC/IP advertisement route, possibly carrying a different leaf indication (and even possibly from different ESes, or from PEs not advertising Ethernet A-D route).

Are you talking about a MAC move, where a MAC can move between Root and Leaf sites? If so, the MAC mobility procedure takes precedence. I have added the following paragraph toward the end of this section:
"In situations where MAC moves are allowed among Leaf and Root sites (e.g., non-static MAC), PEs can receive multiple MAC/IP Advertisement routes for the same MAC address with different Leaf/Root indications (and possibly different ESIs for multi-homing scenarios). In such situations, MAC mobility procedures take precedence to first identify the location of the MAC before associating that MAC with a Root or a Leaf site."


   Tagging MAC addresses with a leaf indication enables remote PEs to
   perform ingress filtering for known unicast traffic - i.e., on the
   ingress PE, the MAC destination address lookup yields, in addition to
   the forwarding adjacency, a flag which indicates whether the target
   MAC is associated with a Leaf site or not.

Ditto, more or less: the procedure above should I think be rephrased to provide unambiguous interpretation in the case where a given MAC is being announced in more than one MAC/IP advertisement route, possibly carrying a different leaf indication.

The new paragraph will take care of it.

The new paragraph takes care of the MAC mobility case, but there possibly remains the case of a MAC being advertised in two distinct MAC/IP advertisement routes for the same dual-homed ES, where this ES is flagged as Leaf or Root consistently by the two dual-homing PEs.

In that case, both advertisements carry the same indication (either Root or Leaf), so there is no issue. This is the normal EVPN process, where a MAC for an all-active multi-homed ES can get advertised by all the multi-homing PEs and the receiving PE builds multiple adjacencies for that MAC.

The ingress PE cross-
   checks this flag with the status of the originating AC, and if both
   are Leafs, then the packet is not forwarded.
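
The ingress filtering of known unicast traffic described in section 3.1 can be sketched in Python as below. This is a minimal illustration under assumptions: the table layout, the names `MacEntry`, `learn_mac`, and `ingress_decision`, and the string results are all hypothetical, and MAC mobility and multi-homing adjacencies are deliberately left out.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

@dataclass(frozen=True)
class MacEntry:
    adjacency: str  # forwarding adjacency toward the advertising PE (illustrative)
    is_leaf: bool   # True only if the MAC/IP route carried the leaf indication

def learn_mac(table: Dict[str, MacEntry], mac: str, adjacency: str,
              leaf_indication: bool = False) -> None:
    # Absence of the flag means the MAC is associated with a root site.
    table[mac] = MacEntry(adjacency, leaf_indication)

def ingress_decision(table: Dict[str, MacEntry], dst_mac: str,
                     source_ac_is_leaf: bool) -> Tuple[str, Optional[str]]:
    entry = table.get(dst_mac)
    if entry is None:
        return ("unknown", None)  # not known unicast; outside this sketch
    if source_ac_is_leaf and entry.is_leaf:
        return ("drop", None)     # Leaf-to-Leaf is filtered at the ingress PE
    return ("forward", entry.adjacency)
```

The lookup returns the adjacency together with the leaf flag, so the Leaf-to-Leaf check costs nothing beyond the destination MAC lookup itself.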

   To support the above ingress filtering functionality, a new E-TREE
   Extended Community with a Leaf indication flag is introduced [section
   5.2]. This new Extended Community MUST be advertised with MAC/IP
   Advertisement route and MAY be advertised with an Ethernet A-D per
   EVI route as described above.

3.2 BUM Traffic

   For BUM traffic, it is not possible to perform filtering on the
   ingress PE, as is the case with known unicast, because of the multi-
   destination nature of the traffic.

Saying "it is not possible" without more explanation is not very useful (the reader may think about using RPF-like techniques on the egress PE).
It seems to me more reasonable to formulate things in terms of "This specification does not provide support for filtering BUM traffic on the ingress PE", and avoid a sentence like the one above.

OK, changed the sentence to:
"This specification does not provide support for filtering BUM traffic on the ingress PE because, unlike the known unicast case described above, the multi-destination nature of BUM traffic makes such filtering impossible on the ingress PE."

Ok.




As such, the solution relies on
   egress filtering. In order to apply the proper egress filtering,
   which varies based on whether a packet is sent from a Leaf AC or a
   root AC, the MPLS-encapsulated frames MUST be tagged with an
   indication when they originated from a Leaf AC. In other words, leaf
   indication for BUM traffic is done at the granularity of AC. This can
   be achieved in EVPN through the use of a MPLS label where it can be
   used to either identify the Ethernet segment of origin per [RFC7432]
   (i.e., ESI label) or it can be used to indicate that the packet is
   originated from a leaf site (Leaf label).
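
The egress filtering of BUM traffic outlined here can be sketched in Python as follows. This is only an illustration: it assumes the egress PE has already mapped the received label to a "from a Leaf AC" boolean, it ignores the ESI split-horizon and DF election rules of [RFC7432], and the function name and AC names are hypothetical.

```python
from typing import Dict, List

def egress_bum_targets(local_acs: Dict[str, str], from_leaf: bool) -> List[str]:
    """Return the local ACs that may receive a BUM frame.

    local_acs maps an AC name to "root" or "leaf"; from_leaf is the leaf
    indication carried with the MPLS-encapsulated frame.
    """
    return [ac for ac, role in local_acs.items()
            if not (from_leaf and role == "leaf")]

# PE2 in Figure 2: AC1 is a Leaf AC and AC2 is a Root AC.
pe2_acs = {"ES2/AC1": "leaf", "ES2/AC2": "root"}
```

A frame tagged as coming from a Leaf AC is delivered only to "ES2/AC2", while a frame from a Root AC is delivered to both ACs.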

   BUM traffic sent over a P2MP LSP or ingress replication, may need to
   carry an upstream assigned or downstream assigned MPLS label
   (respectively) for the purpose of egress filtering to indicate to the
   egress PEs whether this packet is originated from a leaf AC.

   The main difference between downstream and upstream assigned MPLS
   label is that in case of downstream assigned not all egress PE
   devices need to receive the label just like ingress replication
   procedures defined in [RFC7432].

   There are four scenarios to consider as follows. In all these
   scenarios, the imposition PE imposes the right MPLS label associated
   with the originated Ethernet Segment (ES) depending on whether the
   Ethernet frame originated from a Root or a Leaf site on that Ethernet
   Segment (ESI or Leaf label).

The mechanism by which the PE identifies
   whether a given frame originated from a Root or a Leaf site on the
   segment is based on the Ethernet Tag associated with the frame (e.g.,
   whether the frame is received on a leaf or a root AC).

First comment: it seems that the formulation should also support the case where an AC does not use .1q.

Agree. Changed the sentence to:
"The mechanism by which the PE identifies whether a given frame originated from a Root or a Leaf site on the segment is based on the AC identifier for that segment (e.g., Ethernet Tag of the frame for 802.1Q frames). Other mechanisms for identifying root or leaf (e.g., on a per MAC address basis) are beyond the scope of this document."


Ok.


(side comment: doing the identification based on the source MAC address would seem to allow BUM in the context of 2.3; it is out of the scope of my review to extend the scope of these specs, but I'm curious why it is not proposed....)

If we went for per MAC root/leaf identification, then this would have expanded the scope of DF election and egress filtering beyond that of RFC 7432. Currently, we don’t have any such requirements from operators and service providers.

Does the above mean that scenario 2.3 excludes BUM because the DF Election mechanism would not be compatible with the egress filtering mechanism ?
Providing the explanation in 2.3 would I think be helpful.

Done.


4.2 BUM Traffic

   For BUM traffic, the PEs must perform egress filtering. When a PE
   receives a MAC advertisement route (which will be used as a source B-
   MAC), it updates its Ethernet Segment egress filtering function

The "its Ethernet Segment egress filtering function" phrase makes it sound like we're talking about a well-known function defined somewhere.
If this is indeed the case, providing a reference would be in order.
If not, then explaining what this function is would be required.

Changed the sentence to:
"When a PE receives a MAC advertisement route (which will be used as a source B-MAC for BUM traffic), it updates its egress filtering (based on the source B-MAC address), as follows:"

(Are you talking about doing something similar to what 3.2 specifies for the non-PBB procedures ?)
Correct. Similar to 3.2 but based on B-MAC address.

Ok.


   (based on the source B-MAC address), as follows:

   - If the MAC Advertisement route indicates that the advertised B-MAC
   is a Leaf, and the local Ethernet Segment is a Leaf as well, then the
   source B-MAC address is added to the B-MAC filtering list.
Changed it to:
“… is added to its B-MAC list used for egress filtering."

Implicitly we can guess that this "filtering list" is a list of things to exclude, rather than a list of things to include, but the text should I think be explicit.

Changed it as above.

We still don't know if the list is a list of B-MAC to reject or to accept ?
(filter out what is specified in the list vs. filter to keep only what is specified in the list)

Changed the sentence to:
"then the source B-MAC address is added to its B-MAC list used for egress filtering - i.e., to block traffic from that B-MAC address."

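The B-MAC based egress filtering agreed on above (a list of B-MACs to block) can be sketched in Python as below. This is a minimal illustration under assumptions: the function names are hypothetical, the list is kept as a plain set per local Ethernet Segment, and the B-MAC values are made up.

```python
def update_bmac_blocklist(blocklist: set, bmac: str,
                          advertised_leaf: bool, local_es_is_leaf: bool) -> None:
    """Process one received MAC Advertisement route for a source B-MAC."""
    # Only Leaf-to-Leaf combinations are blocked.
    if advertised_leaf and local_es_is_leaf:
        blocklist.add(bmac)

def egress_accepts_bum(blocklist: set, source_bmac: str) -> bool:
    """BUM traffic is blocked when its source B-MAC is on the list."""
    return source_bmac not in blocklist
```

The list is therefore a reject list: a B-MAC that is absent from it is accepted, which matches the final wording "to block traffic from that B-MAC address".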


5.2 PMSI Tunnel Attribute

   [RFC6514] defines PMSI Tunnel attribute which is an optional
   transitive attribute with the following format:

         +---------------------------------+
         |  Flags (1 octet)                |
         +---------------------------------+
         |  Tunnel Type (1 octets)         |
         +---------------------------------+
         |  MPLS Label (3 octets)          |
         +---------------------------------+
         |  Tunnel Identifier (variable)   |
         +---------------------------------+

   This draft uses all the fields per existing definition except for the
   following modifications to the Tunnel Type and Tunnel Identifier:

   When receiver ingress-replication label is needed, the high-order bit
   of the tunnel type field (C bit - Composite tunnel bit) is set while
   the remaining low-order seven bits indicate the tunnel type as
   before. When this C bit is set, the "tunnel identifier" field would
   begin with a three-octet label, followed by the actual tunnel
   identifier for the transmit tunnel.  PEs that don't understand the
   new meaning of the high-order bit would treat the tunnel type as an
   invalid tunnel type. For the PEs that do understand the new meaning
   of the high-order bit, if ingress replication is desired when sending BUM
   traffic, the PE will use the label in the Tunnel Identifier field
   when sending its BUM traffic.
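
The field layout and the C bit handling described above can be sketched in Python as follows. This is an illustration, not a reference decoder: the function name and the returned dictionary keys are assumptions, and the two three-octet label fields are returned as raw bytes rather than decoded into label values.

```python
def parse_pmsi_tunnel_attribute(data: bytes) -> dict:
    """Split a PMSI Tunnel attribute value into its fields, honoring the C bit."""
    if len(data) < 5:
        raise ValueError("need Flags, Tunnel Type and a 3-octet MPLS Label")
    tunnel_type_octet = data[1]
    composite = bool(tunnel_type_octet & 0x80)  # high-order bit (C bit)
    tunnel_id = data[5:]
    ir_label_field = None
    if composite:
        if len(tunnel_id) < 3:
            raise ValueError("composite tunnel needs a leading 3-octet label")
        ir_label_field = tunnel_id[:3]  # receiver ingress-replication label
        tunnel_id = tunnel_id[3:]       # actual identifier of the transmit tunnel
    return {
        "flags": data[0],
        "composite": composite,
        "tunnel_type": tunnel_type_octet & 0x7F,  # low-order seven bits
        "mpls_label_field": data[2:5],
        "ir_label_field": ir_label_field,
        "tunnel_id": tunnel_id,
    }
```

A PE that does not implement the C bit would instead read the whole octet as the tunnel type, which is why the registry question raised below matters.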


Additionally, since RFC7385 has created a registry for PMSI Tunnel attribute tunnel types, taking the most significant bit from this field can't be done without a significant change to how this registry is organized (because now you can't take values in 0x7b-0x7f without colliding with values which are Experimental or Reserved).

Achieving the above requires an update of RFC7385, so I would suggest adding an 8.1 section saying this:

---
The "P-Multicast Service Interface Tunnel (PMSI Tunnel) Tunnel Types" registry in the "Border Gateway Protocol (BGP) Parameters" registry needs to be updated to reflect the use of the most significant bit to advertise the use of "composite tunnels" (section 5.2).

For this purpose, this document updates RFC7385.

The registry is to be updated by removing the entries for 0xFB-0xFE and 0xFF, and replacing them by:
- 0x7B-0x7E Reserved for Experimental Use [this document]
- 0x7F  Reserved [this document]
- 0x80-0xFF Not Allocatable, corresponds to Composite tunnel types [this document]

The allocation policy for values 0x00 to 0x7A is IETF Review [RFC5226<https://tools.ietf.org/html/rfc5226>].
The range for experimental use is now 0x7B-0x7E, and values in this range are not to be assigned.
The status of 0x7F may only be changed through Standards Action [RFC5226<https://tools.ietf.org/html/rfc5226>].
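
The proposed registry layout can be sketched in Python as below. This is only an illustration of the ranges listed in the suggested text; the function name and the returned strings are hypothetical and are not registry wording.

```python
def classify_tunnel_type(value: int) -> str:
    """Map a Tunnel Type octet to its range under the proposed registry update."""
    if not 0 <= value <= 0xFF:
        raise ValueError("Tunnel Type is a single octet")
    if value >= 0x80:
        return "not allocatable (composite tunnel types)"
    if value == 0x7F:
        return "reserved"
    if 0x7B <= value <= 0x7E:
        return "experimental"
    return "IETF Review"  # 0x00-0x7A
```

For instance, `classify_tunnel_type(0x86)` falls in the composite range, and masking it with 0x7F gives 0x06, which is classified as an IETF Review value.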

Done. Thanks for providing the text. It was very helpful.

Ok.
One thing: in the revised text, line breaks are missing for the bullet list ("- 0x7B-0x7E Reserved for Experimental Use [this document]- 0x7F Reserved [this document]- 0x80-0xFF Not Allocatable, corresponds to Composite tunnel types [this document]").
Done.