Re: [bess] shepherd review of draft-ietf-bess-evpn-etree

"Ali Sajassi (sajassi)" <sajassi@cisco.com> Thu, 01 September 2016 22:05 UTC

From: "Ali Sajassi (sajassi)" <sajassi@cisco.com>
To: Thomas Morin <thomas.morin@orange.com>, "draft-ietf-bess-evpn-etree@ietf.org" <draft-ietf-bess-evpn-etree@ietf.org>, Loa Andersson <loa@pi.nu>, "George Swallow -X (swallow - CLEARPATH WORKFORCE MANAGEMENT INC at Cisco)" <swallow@cisco.com>, Eric Rosen <erosen@juniper.net>, BESS <bess@ietf.org>
Thread-Topic: shepherd review of draft-ietf-bess-evpn-etree
Thread-Index: AQHSAgSQiVios9dTSUOmTfNWP8T6rqBlA4UA
Date: Thu, 01 Sep 2016 22:05:22 +0000
Message-ID: <D3EA14B3.1B9CAE%sajassi@cisco.com>
References: <3323ddae-c96f-49a4-2dec-1bfc4ed857dc@orange.com>
In-Reply-To: <3323ddae-c96f-49a4-2dec-1bfc4ed857dc@orange.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/bess/8FGarK02jft8wUFEojlBLYWM_VI>
Cc: "Ali Sajassi (sajassi)" <sajassi@cisco.com>, Martin Vigoureux <martin.vigoureux@nokia.com>
Subject: Re: [bess] shepherd review of draft-ietf-bess-evpn-etree

Hi Thomas,

Thanks very much for your thorough review and valuable comments. I have resolved all of them and incorporated all the required modifications. Please refer to the comment resolutions below and let me know if there are any further comments. You can find the latest rev of the draft with the incorporated changes in:

https://www.ietf.org/id/draft-ietf-bess-evpn-etree-07.txt

Thanks,
Ali

From: Thomas Morin <thomas.morin@orange.com>
Organization: Orange
Date: Monday, August 29, 2016 at 7:49 AM
To: "draft-ietf-bess-evpn-etree@ietf.org" <draft-ietf-bess-evpn-etree@ietf.org>, Loa Andersson <loa@pi.nu>, "George Swallow -X (swallow - CLEARPATH WORKFORCE MANAGEMENT INC at Cisco)" <swallow@cisco.com>, Eric Rosen <erosen@juniper.net>, BESS <bess@ietf.org>
Cc: Martin Vigoureux <martin.vigoureux@nokia.com>
Subject: shepherd review of draft-ietf-bess-evpn-etree
Resent-From: <alias-bounces@ietf.org>
Resent-To: Cisco Employee <sajassi@cisco.com>, <ssalam@cisco.com>, <ju1738@att.com>, <jdrake@juniper.net>, <sboutros@vmware.com>, <jorge.rabadan@nokia.com>
Resent-Date: Monday, August 29, 2016 at 7:49 AM


Hi,

Here is the review I did while preparing the shepherd write-up for draft-ietf-bess-evpn-etree.

There is one thing that stands out: the document uses the most significant bit of the "PMSI Tunnel Attribute" Tunnel Type field, which can't be done, I think, without updating the structure of the corresponding registry (RFC7385).  I suggest how that can be done at the end of this review (a few people Cc'd for this reason).

Please find my comments below...

[...]

2  E-Tree Scenarios and EVPN / PBB-EVPN Support

   In this section, we will categorize support for E-Tree into three
   different scenarios, depending on the nature of the site association
   (Root/Leaf) per PE or per Ethernet Segment:

   - Leaf OR Root site(s) per PE

   - Leaf OR Root site(s) per AC

   - Leaf OR Root site(s) per MAC

2.1 Scenario 1: Leaf OR Root site(s) per PE

   In this scenario, a PE may receive traffic from either Root sites OR
   Leaf sites for a given MAC-VRF/bridge table, but not both
   concurrently. In other words, a given EVI on a PE is either
   associated with a root or leaf.

s/with a root or leaf/with roots or leaves/ ?

Done. Changed it to “root(s) or leaf(s)"

The PE may have both Root and Leaf

s/both Root and Leaf/both Roots and Leaves/ ?

In this sentence “Root” and “Leaf” are used as adjectives for sites, so AFAIK, they should remain singular.

   sites albeit for different EVIs.

                   +---------+            +---------+
                   |   PE1   |            |   PE2   |
    +---+          |  +---+  |  +------+  |  +---+  |            +---+
    |CE1+---ES1----+--+   |  |  | MPLS |  |  |   +--+----ES2-----+CE2|
    +---+  (Root)  |  |MAC|  |  |  /IP |  |  |MAC|  |   (Leaf)   +---+
                   |  |VRF|  |  |      |  |  |VRF|  |
                   |  |   |  |  |      |  |  |   |  |            +---+
                   |  |   |  |  |      |  |  |   +--+----ES3-----+CE3|
                   |  +---+  |  +------+  |  +---+  |   (Leaf)   +---+
                   +---------+            +---------+

   Figure 1: Scenario 1


   In such scenario, an EVPN PE implementation MAY provide E-TREE
   service using topology constraint among the PEs belonging to the same

"topology constraint" is a bit opaque as a term, perhaps "using tailored BGP RT import/export policies" would be more descriptive (assuming I understood your intent)

Done. Changed it to “topology constraint tailored by BGP Route Target (RT) import/export policies"

   EVI. The purpose of this topology constraint is to avoid having PEs
   with only  Leaf sites importing and processing BGP MAC routes from
   each other. To support such topology constrain in EVPN, two BGP
   Route-Targets (RTs) are used for every EVPN Instance (EVI): one RT is
   associated with the Root sites and the other is associated with the
   Leaf sites. On a per EVI basis, every PE exports the single RT
   associated with its type of site(s). Furthermore, a PE with Root
   site(s) imports both Root and Leaf RTs, whereas a PE with Leaf
   site(s) only imports the Root RT.

The text seems to imply that the above is sufficient to deliver the service, but I fail to see what would prevent Leaf-to-Leaf traffic between Leaves bound to the same MAC-VRF (ES2 and ES3 in Figure 1).  Shouldn't the text mention the use of a split-horizon in Leaf MAC-VRFs ?

Agree, nice catch! I changed the first sentence from:
"In such scenario, an EVPN PE implementation MAY provide E-TREE service using topology constraint among the PEs belonging to the same EVI."
TO
"In such scenario, topology constraint, provided by BGP Route Target (RT) import/export policies among the PEs belonging to the same EVI, can be used to restrict the communications among Leaf PEs."
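
As an illustration of the Scenario 1 Route Target rule quoted above (each PE exports the RT matching its site type; Root PEs import both RTs, Leaf PEs import only the Root RT), here is a minimal Python sketch. The RT strings and the function names are hypothetical and chosen only for illustration; they do not come from the draft.

```python
# Hypothetical RT values for a single EVI (assumption, not from the draft).
ROOT_RT = "65000:100"
LEAF_RT = "65000:101"


def export_rts(pe_role):
    """RT a PE attaches to its routes: the one matching its site type."""
    return {ROOT_RT} if pe_role == "root" else {LEAF_RT}


def import_rts(pe_role):
    """Root PEs import both RTs; Leaf PEs import only the Root RT."""
    return {ROOT_RT, LEAF_RT} if pe_role == "root" else {ROOT_RT}


def imports_routes_from(receiver_role, sender_role):
    """True if the receiver imports routes exported by the sender."""
    return bool(import_rts(receiver_role) & export_rts(sender_role))
```

Under these assumptions the only pair that is filtered is Leaf PE to Leaf PE. As noted in the exchange above, this says nothing about Leaf ACs attached to the same MAC-VRF (ES2 and ES3 in Figure 1), which still need a split-horizon group.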


(assuming the previous point is resolved:)

With this mechanism above, isn't it possible to have on a given PE, for a single E-TREE EVI, both Leaves and Roots, as long as distinct MAC-VRFs are used (one for Leaves and one for Roots) ?   (it seems to me that the asymmetric import/export RT would do what is needed to build an E-TREE, we would just have a particular case where a Leaf MAC-VRF and a Root MAC-VRF for a given E-TREE end up on a single PE)

That’s not possible because, by definition of an EVI, there is only a single MAC-VRF per EVI on a PE. Besides, I don’t understand what good it does to have two MAC-VRFs on the same PE (one for Leafs and another for Roots), because Leafs and Roots need to talk to each other and thus we want them in the same MAC-VRF. However, Leafs should not talk among themselves, and thus we can put all the Leaf ACs in a split-horizon group.


If this is not possible, I think the text should explain why.

I don’t think we need an explanation, for the reason above, but if you think otherwise, then please suggest the text you think I should add.

If the number of EVIs is very large
   (e.g., more than 64K), then RT type 0 as defined in [RFC4360] SHOULD
   be used; otherwise, RT type 2 is sufficient [RFC7153].

This is not specific to E-VPN E-TREE, or even to E-VPN, why mention that ?

Agree, it is not specific to E-TREE. I will take it out.

2.2 Scenario 2: Leaf OR Root site(s) per AC

   In this scenario, a PE receives traffic from either Root OR Leaf
   sites (but not both) on a given Attachment Circuit (AC) of an EVI. In
   other words, an AC (ES or ES/VLAN) is either associated with a Root
   or Leaf (but not both).

s/with a Root or Leaf/with Roots or Leaves/ ?

Agree – Changed it to "Root(s) or Leaf(s)"


                     +---------+            +---------+
                     |   PE1   |            |   PE2   |
    +---+            |  +---+  |  +------+  |  +---+  |            +---+
    |CE1+-----ES1----+--+   |  |  |      |  |  |   +--+---ES2/AC1--+CE2|
    +---+    (Leaf)  |  |MAC|  |  | MPLS |  |  |MAC|  |   (Leaf)   +---+
                     |  |VRF|  |  |  /IP |  |  |VRF|  |
                     |  |   |  |  |      |  |  |   |  |            +---+
                     |  |   |  |  |      |  |  |   +--+---ES2/AC2--+CE3|
                     |  +---+  |  +------+  |  +---+  |   (Root)   +---+
                     +---------+            +---------+

   Figure 2: Scenario 2

   In this scenario, if there are PEs with only root (or leaf) sites per
   EVI, then the RT constrain procedures described in section 2.1 can
   also be used here. However, when a Root site is added to a Leaf PE,
   then that PE needs to process MAC routes from all other Leaf PEs and
   add them to its forwarding table.

This is the case in 2.1 as well, isn't it ?

It can start as 2.1 but as soon as you add Root site to a Leaf PE, then it becomes different (per last sentence of the above para).

For this scenario, if for a given
   EVI, the majority of PEs will eventually have both Leaf and Root
   sites attached, even though they may start as Root-only or Leaf-only
   PEs, then it is recommended to use a single RT per EVI and avoid
   additional configuration and operational overhead.

Why this recommendation ?
Even with a majority of PEs having both Leaves and Roots, there can remain (up to 49% of) PEs having only Leaves, which will uselessly have all routes to other Leaves.

So "it is recommended" above, deserves to be explained more, I think.

OK, I changed “majority” to “vast majority” :-)


2.3 Scenario 3: Leaf OR Root site(s) per MAC

   In this scenario, a PE may receive traffic from both Root AND Leaf
   sites on a given Attachment Circuit (AC) of an EVI. Since an
   Attachment Circuit (ES or ES/VLAN) carries traffic from both Root and
   Leaf sites, the granularity at which Root or Leaf sites are
   identifies

s/identifies/identified/

Done.

is on a per MAC address. This scenario is considered in
   this draft for EVPN service with only known unicast traffic - i.e.,
   there is no BUM traffic.

"there is no BUM" is quite a bold claim ! :=

Maybe the text should say "no BUM traffic is supported (BUM traffic will be dropped)" ?

(possibly "BUM traffic from Leaves will be dropped" would be sufficient ?)

Changed it to “BUM traffic is not supported in this scenario and it is dropped”.

                     +---------+            +---------+
                     |   PE1   |            |   PE2   |
    +---+            |  +---+  |  +------+  |  +---+  |            +---+
    |CE1+-----ES1----+--+   |  |  |      |  |  |   +--+---ES2/AC1--+CE2|
    +---+    (Root)  |  | E |  |  | MPLS |  |  | E |  | (Leaf/Root)+---+
                     |  | V |  |  |  /IP |  |  | V |  |
                     |  | I |  |  |      |  |  | I |  |            +---+
                     |  |   |  |  |      |  |  |   +--+---ES2/AC2--+CE3|
                     |  +---+  |  +------+  |  +---+  |   (Leaf)   +---+
                     +---------+            +---------+

   Figure 3: Scenario 3

3 Operation for EVPN

   [RFC7432] defines the notion of ESI MPLS label used for split-horizon
   filtering of BUM traffic at the egress PE. Such egress filtering
   capabilities can be leveraged in provision of E-TREE services as seen
   shortly. In other words, [RFC7432] has inherent capability to support
   E-TREE services without defining any new BGP routes but by just
   defining a new BGP Extended Community for leaf indication as shown
   later in this document.

3.1 Known Unicast Traffic

   Since in EVPN, MAC learning is performed in control plane via
   advertisement of BGP routes, the filtering needed by E-TREE service
   for known unicast traffic can be performed at the ingress PE, thus
   providing very efficient filtering and avoiding sending known unicast
   traffic over MPLS/IP core to be filtered at the egress PE as done in
   traditional E-TREE solutions (e.g., E-TREE for VPLS).

   To provide such ingress filtering for known unicast traffic, a PE
   MUST indicate to other PEs what kind of sites (root or leaf) its MAC
   addresses are associated with by advertising a leaf indication flag
   (via an Extended Community) along with each of its MAC/IP
   Advertisement route. The lack of such flag indicates that the MAC
   address is associated with a root site.



  This scheme applies to all
   scenarios described in section 2.

   Furthermore, for multi-homing scenario of section 2.2, where an AC is
   either root or leaf (but not both), the PE MAY advertise leaf
   indication along with the Ethernet A-D per EVI route. This
   advertisement is used for sanity checking in control-plane to ensure
   that there is no discrepancy in configuration among different PEs of
   the same redundancy group. For example, if a leaf site is multi-homed
   to PE1 an PE2, and PE1 advertises the Ethernet A-D per EVI
   corresponding to this leaf site with the leaf-indication flag but PE2
   does not, then the receiving PE notifies the operator of such
   discrepancy and ignore the leaf-indication flag on PE1. In other
   words, in case of discrepancy, the multi-homing for that pair of PEs
   is assumed to be in default "root" mode for that <ESI, EVI> or <ESI,
   EVI/VLAN>. The leaf indication flag on Ethernet A-D per EVI route
   tells the receiving PEs that all MAC addresses associated with this
   <ESI, EVI> or <ESI, EVI/VLAN> are from a leaf site. Therefore, if a
   PE receives a leaf indication for an AC via the Ethernet A-D per EVI
   route but doesn't receive a leaf indication in the corresponding MAC
   route,

"MAC route" --> "MAC/IP Advertisement route" ?

I changed it to “MAC/IP Advertisement route”.

then it notify the operator and ignore the leaf indication on

s/notify/notifies/

Done.
   the Ethernet A-D per EVI route.


The procedure above should I think be rephrased to provide unambiguous interpretation in the case where a given MAC is being announced in more than one MAC/IP advertisement route, possibly carrying a different leaf indication (and even possibly from different ESes, or from PEs not advertising Ethernet A-D route).

Are you talking about MAC move where a MAC can move between Root and Leaf sites? If so, MAC mobility procedure takes precedence. I have added the following paragraph toward the end of this section:
"In situations where MAC moves are allowed among Leaf and Root sites (e.g., non-static MAC), PEs can receive multiple MAC/IP Advertisement routes for the same MAC address with different Leaf/Root indications (and possibly different ESIs for multi-homing scenarios). In such situations, MAC mobility procedures take precedence to first identify the location of the MAC before associating that MAC with a Root or a Leaf site."
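
The Ethernet A-D per EVI sanity check described earlier in this section (a leaf flag from one PE of the redundancy group but not from the other is reported to the operator, and the segment falls back to the default "root" mode) might be sketched as follows. The function name and the dictionary input are assumptions made for illustration, not anything defined by the draft.

```python
def effective_leaf_mode(ad_leaf_flags):
    """Decide how a receiving PE treats an <ESI, EVI>.

    ad_leaf_flags maps a PE name to the leaf flag seen on that PE's
    Ethernet A-D per EVI route for the segment.
    Returns (is_leaf, discrepancy).
    """
    flags = set(ad_leaf_flags.values())
    if len(flags) > 1:
        # Mixed flags: notify the operator, ignore the leaf indication.
        return False, True
    return flags == {True}, False
```

For example, `effective_leaf_mode({"PE1": True, "PE2": False})` reports a discrepancy and treats the segment as root.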


   Tagging MAC addresses with a leaf indication enables remote PEs to
   perform ingress filtering for known unicast traffic - i.e., on the
   ingress PE, the MAC destination address lookup yields, in addition to
   the forwarding adjacency, a flag which indicates whether the target
   MAC is associated with a Leaf site or not.

Ditto, more or less: the procedure above should I think be rephrased to provide unambiguous interpretation in the case where a given MAC is being announced in more than one MAC/IP advertisement route, possibly carrying a different leaf indication.

The new paragraph will take care of it.

The ingress PE cross-
   checks this flag with the status of the originating AC, and if both
   are Leafs, then the packet is not forwarded.
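
A minimal sketch of the ingress cross-check just described, assuming a MAC table keyed by destination MAC whose entries carry the leaf flag learned from the MAC/IP Advertisement route. The `MacEntry` type and `ingress_forward` function are hypothetical names, not part of the draft.

```python
from dataclasses import dataclass


@dataclass
class MacEntry:
    next_hop: str   # forwarding adjacency toward the egress PE
    is_leaf: bool   # True if the route carried the leaf indication


def ingress_forward(mac_table, dst_mac, src_ac_is_leaf):
    """Return the next hop for a known unicast frame, or None to drop it."""
    entry = mac_table[dst_mac]  # known unicast only: the entry must exist
    if src_ac_is_leaf and entry.is_leaf:
        return None             # Leaf-to-Leaf: not forwarded
    return entry.next_hop
```

The drop happens before the frame enters the MPLS/IP core, which is the efficiency argument made in section 3.1.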




   To support the above ingress filtering functionality, a new E-TREE
   Extended Community with a Leaf indication flag is introduced [section
   5.2]. This new Extended Community MUST be advertised with MAC/IP
   Advertisement route and MAY be advertised with an Ethernet A-D per
   EVI route as described above.

3.2 BUM Traffic

   For BUM traffic, it is not possible to perform filtering on the
   ingress PE, as is the case with known unicast, because of the multi-
   destination nature of the traffic.

Saying "it is not possible" without more explanation is not very useful (the reader may think about using RPF-like techniques on the egress PE).
It seems to me more reasonable to formulate things in terms of "This specification does not provide support for filtering BUM traffic on the ingress PE", and avoid a sentence like the one above.

OK, Changed the sentence to:
"This specification does not provide support for filtering BUM traffic on the ingress PE because it is not possible to perform filtering of BUM traffic on the ingress PE, as is the case with known unicast described above, due to the multi-destination nature of BUM traffic."


As such, the solution relies on
   egress filtering. In order to apply the proper egress filtering,
   which varies based on whether a packet is sent from a Leaf AC or a
   root AC, the MPLS-encapsulated frames MUST be tagged with an
   indication when they originated from a Leaf AC. In other words, leaf
   indication for BUM traffic is done at the granularity of AC. This can
   be achieved in EVPN through the use of a MPLS label where it can be
   used to either identify the Ethernet segment of origin per [RFC7432]
   (i.e., ESI label) or it can be used to indicate that the packet is
   originated from a leaf site (Leaf label).

   BUM traffic sent over a P2MP LSP or ingress replication, may need to
   carry an upstream assigned or downstream assigned MPLS label
   (respectively) for the purpose of egress filtering to indicate to the
   egress PEs whether this packet is originated from a leaf AC.

   The main difference between downstream and upstream assigned MPLS
   label is that in case of downstream assigned not all egress PE
   devices need to receive the label just like ingress replication
   procedures defined in [RFC7432].

   There are four scenarios to consider as follow.

s/follow/follows/
Done.

In all these
   scenarios, the imposition PE imposes the right MPLS label associated

s/the imposition PE imposes/the ingress PE imposes/ ?
Done.
   with the originated Ethernet Segment (ES) depending on whether the
   Ethernet frame originated from a Root or a Leaf site on that Ethernet
   Segment (ESI or Leaf label).

The mechanism by which the PE identifies
   whether a given frame originated from a Root or a Leaf site on the
   segment is based on the Ethernet Tag associated with the frame (e.g.,
   whether the frame received on a leaf or a root AC).

First comment: it seems that the formulation should also support the case where an AC does not use .1q.

Agree. Changed the sentence to:
"The mechanism by which the PE identifies whether a given frame originated from a Root or a Leaf site on the segment is based on the AC identifier for that segment (e.g., Ethernet Tag of the frame for 802.1Q frames). Other mechanisms for identifying root or leaf (e.g., on a per MAC address basis) are beyond the scope of this document."


(side comment: doing the identification based on the source MAC address would seem to allow BUM in the context of 2.3; it is out of the scope of my review to extend the scope of these specs, but I'm curious why it is not proposed....)

If we went for per MAC root/leaf identification, then this would have expanded the scope of DF election and egress filtering beyond that of RFC 7432. Currently, we don’t have any such requirements from operators and service providers.
Other mechanisms
   for identifying whether an ingress AC is a root or leaf is beyond the
   scope of this document.

3.2.1 BUM traffic originated from a single-homed site on a leaf AC

   In this scenario, the ingress PE adds a special MPLS label indicating
   a Leaf site. This special Leaf MPLS label, used for single-homing
   scenarios, is not on a per ES basis but rather on a per PE basis -
   i.e., a single Leaf MPLS label is used for all single-homed ES's on
   that PE. This Leaf label is advertised to other PE devices, using a
   new EVPN Extended Community called E-TREE Extended Community (section
   5.1) along with an Ethernet A-D per ES route with ESI of zero and a
   set of Route Targets (RTs) corresponding to all EVIs on the PE with
   at least one leaf site per EVI. The set of Ethernet A-D per ES routes
   may be needed if the number of Route Targets (RTs) that need to be
   sent exceed the limit on a single route per [RFC7432]. The ESI for
   the Ethernet A-D per ES route is set to zero to indicate single-homed
   sites.

   When a PE receives this special Leaf label in the data path, it
   blocks the packet if the destination AC is of type Leaf; otherwise,
   it forwards the packet.
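
The egress rule above can be illustrated with a short sketch. It covers only the Leaf-label check; DF election and ESI split-horizon filtering per [RFC7432] are deliberately left out, and the function name and AC map are assumptions for illustration.

```python
def egress_deliver(has_leaf_label, local_acs):
    """Return the local ACs that receive a BUM frame.

    local_acs maps an AC name to True if that AC is a Leaf AC.
    A frame carrying the Leaf label is blocked on Leaf ACs only.
    """
    return [name for name, is_leaf in local_acs.items()
            if not (has_leaf_label and is_leaf)]
```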

3.2.2 BUM traffic originated from a single-homed site on a root AC

   In this scenario, the ingress PE does not add any ESI or Leaf label
   and it operates per [RFC7432] procedures.

3.2.3 BUM traffic originated from a multi-homed site on a leaf AC

   In this scenario, it is assumed that While different ACs (VLANs) on

s/While/while/
Done.
   the same ES could have different root/leaf designation (some being
   roots and some being leaves), the same VLAN does have the same
   root/leaf designation on all PEs on the same ES. Furthermore, it is
   assumed that there is no forwarding among subnets - ie, the service
   is EVPN L2 and not EVPN IRB. IRB use case is outside the scope of
   this document.

   In such scenarios,  If a multicast packet is originated from a leaf

s/multicast/multicast or broadcast/ ?
Done.
   AC, then it only needs to carry Leaf label described in section
   3.2.1. This label is sufficient in providing the necessary egress
   filtering of BUM traffic from getting sent to leaf ACs including the
   leaf AC on the same Ethernet Segment.

3.2.4 BUM traffic originated from a multi-homed site on a root AC

   In this scenario, both the ingress and egress PE devices follows the
   procedure defined in [RFC7432] for adding and/or processing an ESI
   MPLS label.
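
The four cases of sections 3.2.1 to 3.2.4 reduce to a small decision at the ingress PE. The sketch below is one reading of that text; the `Label` enum and `bum_label` function are hypothetical names, and the ESI case simply stands for "follow the [RFC7432] ESI label procedures".

```python
from enum import Enum


class Label(Enum):
    NONE = "none"   # no ESI or Leaf label added
    LEAF = "leaf"   # per-PE Leaf label
    ESI = "esi"     # ESI label per RFC 7432


def bum_label(multi_homed, ac_is_leaf):
    """Pick the label an ingress PE adds to a BUM frame."""
    if ac_is_leaf:
        return Label.LEAF   # 3.2.1 and 3.2.3: Leaf label is sufficient
    if multi_homed:
        return Label.ESI    # 3.2.4: RFC 7432 ESI label procedures
    return Label.NONE       # 3.2.2: plain RFC 7432 operation
```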

3.3 E-TREE Traffic Flows for EVPN

   Per [RFC7387], a generic E-Tree service supports all of the following
   traffic flows:

        - Ethernet Unicast from Root to Roots & Leaf
        - Ethernet Unicast from Leaf to Root
        - Ethernet Broadcast/Multicast from Root to Roots & Leafs
        - Ethernet Broadcast/Multicast from Leaf to Roots

   A particular E-Tree service may need to support all of the above
   types of flows or only a select subset, depending on the target
   application. In the case where unicast flows need not be supported,
   the L2VPN PEs can avoid performing any MAC learning function.

   In the subsections that follow, we will describe the operation of
   EVPN to support E-Tree service with and without MAC learning.

3.3.1 E-Tree with MAC Learning

   The PEs implementing an E-Tree service must perform MAC learning when
   unicast traffic flows must be supported among Root and Leaf sites. In
   this case, the PE with Root sites performs MAC learning in the data-

s/the PE with Root sites/the PEs with Root sites/ ?
(or "PE(s)"...)
Done.
   path over the Ethernet Segments, and advertises reachability in EVPN
   MAC Advertisement routes. These routes will be imported by all PEs
   for that EVI (i.e., PEs that have Leaf sites as well as PEs that have
   Root sites). Similarly, the PEs with Leaf sites perform MAC learning
   in the data-path over their Ethernet Segments, and advertise
   reachability in EVPN MAC Advertisement routes. For the scenario
   described in section 2.1 (or possibly section 2.2), these routes are
   imported only by PEs with at least one Root site in the EVI - i.e., a
   PE with only Leaf sites will not import these routes. PEs with Root
   and/or Leaf sites may use the Ethernet A-D routes for aliasing (in
   the case of multi-homed segments) and for mass MAC withdrawal per
   [RFC7432].

   To support multicast/broadcast from Root to Leaf sites, either a P2MP
   tree rooted at the PE(s) with the Root site(s) or ingress replication
   can be used. The multicast tunnels are set up through the exchange of
   the EVPN Inclusive Multicast route, as defined in [RFC7432].

   To support multicast/broadcast from Leaf to Root sites, ingress
   replication should be sufficient for most scenarios where there are
   only a few Roots (typically two). Therefore, in a typical scenario, a
   root PE needs to support both a P2MP tunnel in transmit direction
   from itself to leaf PEs and at the same time it needs to support
   ingress-replication tunnels in receive direction from leaf PEs to
   itself. In order to signal this efficiently from the root PE, a new
   composite tunnel type is defined per section 5.3.  This new composite
   tunnel type is advertised by the root PE to simultaneously indicate a
   P2MP tunnel in transmit direction and an ingress-replication tunnel
   in the receive direction for the BUM traffic.

   If the number of Roots is large, P2MP tunnels originated at the PEs
   with Leaf sites may be used and thus there will be no need to use the
   modified PMSI tunnel attribute in section 5.2 for composite tunnel
   type.

3.3.2 E-Tree without MAC Learning

   The PEs implementing an E-Tree service need not perform MAC learning
   when the traffic flows between Root and Leaf sites are only multicast
   or broadcast. In this case, the PEs do not exchange EVPN MAC
   Advertisement routes. Instead, the Inclusive Multicast Ethernet Tag
   (IMET) routes are used to support BUM traffic.

It's nicer to avoid "acronym soup" when possible.
Here the acronym is used only once... Using  "Inclusive Multicast Ethernet Tag route" in these two places is ok I think.
Done.

   The fields of the IMET route are populated per the procedures defined
   in [RFC7432], and the multicast tunnel setup criteria are as
   described in the previous section.

   Just as in the previous section, if the number of PEs with root sites
   are only a few and thus ingress replication is desired from leaf PEs
   to these root PEs, then the modified PMSI attribute as defined in
   section 5.3 should be used.

4 Operation for PBB-EVPN

   In PBB-EVPN, the PE advertises a Root/Leaf indication along with each
   B-MAC Advertisement route, to indicate whether the associated B-MAC
   address corresponds to a Root or a Leaf site. Just like the EVPN
   case, the new E-TREE Extended Community defined in section [5.1] is
   advertised with each MAC Advertisement route.

   In the case where a multi-homed Ethernet Segment has both Root and
   Leaf sites attached, two B-MAC addresses are advertised: one B-MAC
   address is per ES as specified in [RFC7623] and implicitly denoting
   Root, and the other B-MAC address is per PE and explicitly denoting
   Leaf. The former B-MAC address is not advertised with the E-TREE
   extended community but the latter B-MAC denoting Leaf is advertised
   with the new E-TREE extended community where "Leaf-indication" flag
   is set. In such multi-homing scenarios where and Ethernet Segment has
   both Root and Leaf ACs, it is assumed that While different ACs
   (VLANs) on the same ES could have different root/leaf designation
   (some being roots and some being leaves), the same VLAN does have the
   same root/leaf designation on all PEs on the same ES. Furthermore, it
   is assumed that there is no forwarding among subnets - ie, the
   service is L2 and not IRB. IRB use case is outside the scope of this
   document.

   The ingress PE uses the right B-MAC source address depending on
   whether the Ethernet frame originated from the Root or Leaf AC on
   that Ethernet Segment. The mechanism by which the PE identifies
   whether a given frame originated from a Root or Leaf site on the
   segment is based on the Ethernet Tag associated with the frame. Other
   mechanisms of identification, beyond the Ethernet Tag, are outside
   the scope of this document.

   Furthermore, a PE advertises two special global B-MAC addresses: one
   for Root and another for Leaf, and tags the Leaf one as such in the
   MAC Advertisement route. These B-MAC addresses are used as source
   addresses for traffic originating from single-homed segments. The B-
   MAC address used for indicating Leaf sites can be the same for both
   single-homed and multi-homed segments.
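
As a minimal sketch of the source B-MAC selection described above: the class, function, and table names are hypothetical and not taken from the draft, and the per-tag Root/Leaf table is an assumed local configuration.

```python
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class SegmentBMacs:
    """Hypothetical holder for the two B-MACs used on one Ethernet Segment."""
    root_bmac: str  # per-ES B-MAC, advertised without the E-TREE EC (implicitly Root)
    leaf_bmac: str  # per-PE B-MAC, advertised with the Leaf-indication flag set


def select_source_bmac(bmacs: SegmentBMacs,
                       tag_is_leaf: Mapping[int, bool],
                       ethernet_tag: int) -> str:
    """Pick the source B-MAC for a frame from its Ethernet Tag.

    tag_is_leaf is an assumed local table: Ethernet Tag -> True if the AC is a Leaf.
    Raising KeyError for an unconfigured tag is a choice of this sketch only.
    """
    return bmacs.leaf_bmac if tag_is_leaf[ethernet_tag] else bmacs.root_bmac


# Example values are illustrative (documentation MAC range).
es1 = SegmentBMacs(root_bmac="00:00:5e:00:53:01", leaf_bmac="00:00:5e:00:53:02")
tags = {100: False, 200: True}  # VLAN 100 is a Root AC, VLAN 200 is a Leaf AC
root_src = select_source_bmac(es1, tags, 100)
leaf_src = select_source_bmac(es1, tags, 200)
```

Other identification mechanisms would replace the table lookup; the draft leaves them out of scope.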

4.1 Known Unicast Traffic

   For known unicast traffic, the PEs perform ingress filtering: On the
   ingress PE, the C-MAC destination address lookup yields, in addition
   to the target B-MAC address and forwarding adjacency, a flag which
   indicates whether the target B-MAC is associated with a Root or a
   Leaf site. The ingress PE cross-checks this flag with the status of
   the originating site, and if both are a Leaf, then the packet is not
   forwarded.
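
The ingress check in 4.1 can be sketched as follows. This is an illustration under assumptions, not the draft's implementation: CMacEntry, ingress_permits, and the table layout are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CMacEntry:
    """Hypothetical result of the C-MAC destination lookup on the ingress PE."""
    target_bmac: str      # B-MAC behind which the destination C-MAC sits
    adjacency: str        # forwarding adjacency (opaque in this sketch)
    target_is_leaf: bool  # flag learned from the E-TREE EC on the B-MAC route


def ingress_permits(entry: CMacEntry, source_is_leaf: bool) -> bool:
    """Drop only when both the originating site and the target are Leaf."""
    return not (source_is_leaf and entry.target_is_leaf)


# Illustrative table keyed by destination C-MAC.
cmac_table = {
    "00:00:5e:00:53:10": CMacEntry("00:00:5e:00:53:01", "pe2", target_is_leaf=False),
    "00:00:5e:00:53:11": CMacEntry("00:00:5e:00:53:02", "pe2", target_is_leaf=True),
}
leaf_to_root = ingress_permits(cmac_table["00:00:5e:00:53:10"], source_is_leaf=True)
leaf_to_leaf = ingress_permits(cmac_table["00:00:5e:00:53:11"], source_is_leaf=True)
```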

4.2 BUM Traffic

   For BUM traffic, the PEs must perform egress filtering. When a PE
   receives a MAC advertisement route (which will be used as a source B-
   MAC), it updates its Ethernet Segment egress filtering function

The "its Ethernet Segment egress filtering function" phrase makes it sound like we're talking about a well-known function defined somewhere.
If this is indeed the case, providing a reference would be in order.
If not, then explaining what this function is would be required.

Changed the sentence to:
"When a PE receives a MAC advertisement route (which will be used as a source B-MAC for BUM traffic), it updates its egress filtering (based on the source B-MAC address), as follows:"

(Are you talking about doing something similar to what 3.2 specifies for the non-PBB procedures ?)
Correct. Similar to 3.2 but based on B-MAC address.

   (based on the source B-MAC address), as follows:

   - If the MAC Advertisement route indicates that the advertised B-MAC
   is a Leaf, and the local Ethernet Segment is a Leaf as well, then the
   source B-MAC address is added to the B-MAC filtering list.
Changed it to:
“… is added to its B-MAC list used for egress filtering."

Implicitly we can guess that this "filtering list" is a list of things to exclude, rather than a list of things to include, but I think the text should be explicit.

Changed it as above.



   - Otherwise, the B-MAC filtering list is not updated.

   When the egress PE receives the packet, it examines the B-MAC source
   address to check whether it should filter or forward the frame. Note
   that this uses the same filtering logic as baseline [RFC7623] and
   does not require any additional flags in the data-plane.
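
To make the 4.2 procedure concrete, here is a small sketch of a per-segment egress filter. It assumes one filter object per local Ethernet Segment; the class and method names are hypothetical.

```python
class EgressBumFilter:
    """Hypothetical per-Ethernet-Segment egress filter keyed on source B-MAC."""

    def __init__(self, local_segment_is_leaf: bool):
        self.local_segment_is_leaf = local_segment_is_leaf
        self.filtered_bmacs = set()  # source B-MACs whose BUM frames are dropped

    def on_mac_advertisement(self, bmac: str, advertised_as_leaf: bool) -> None:
        """Update the list when a B-MAC Advertisement route is received."""
        if advertised_as_leaf and self.local_segment_is_leaf:
            self.filtered_bmacs.add(bmac)
        # Otherwise the list is not updated.

    def permits(self, source_bmac: str) -> bool:
        """Data-plane check on the B-MAC source address of a received frame."""
        return source_bmac not in self.filtered_bmacs


leaf_segment = EgressBumFilter(local_segment_is_leaf=True)
leaf_segment.on_mac_advertisement("00:00:5e:00:53:02", advertised_as_leaf=True)
leaf_segment.on_mac_advertisement("00:00:5e:00:53:01", advertised_as_leaf=False)
```

The list here holds B-MACs to exclude: a frame is forwarded unless its source B-MAC is in the list.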

   The PE places all Leaf Ethernet Segments of a given bridge domain in
   a single split-horizon group in order to prevent intra-PE forwarding
   among Leaf segments.

As per a previous comment: isn't this something that should be common between the EVPN and the PBB-EVPN procedures ?
Last paragraph is the same. So, I changed it to:
“Just as in section 3.2, the PE places all Leaf Ethernet Segments …."


This split-horizon function applies to BUM
   traffic.

Does it mean "only" ?
(if yes, please be explicit, and if no, adding "as well" or something like that would be better)
It applies to both known unicast and BUM. So, I am adding “as well” to the sentence:



4.3 E-Tree without MAC Learning

   In scenarios where the traffic of interest is only multicast and/or
   broadcast, the PEs implementing an E-Tree service do not need to do
   any MAC learning. In such scenarios the filtering must be performed
   on egress PEs. For PBB-EVPN, the handling of such traffic is per
   section 4.2 without the C-MAC learning part of it at both ingress and
   egress PEs.

5 BGP Encoding

   This document defines two new BGP Extended Community for EVPN.

5.1 E-TREE Extended Community

   This Extended Community is a new transitive Extended Community having
   a Type field value of 0x06 (EVPN) and the Sub-Type 0x05. It is used
   for leaf indication of known unicast and BUM traffic. For BUM
   traffic, the Leaf Label field is set to a valid MPLS label and this
   EC is advertised along with Ethernet A-D per ES route with an ESI of
   zero to enable egress filtering on disposition PEs per section 3.2.1
   and 3.2.3. There is no need to send ESI Label Extended Community when
   sending Ethernet A-D per ES route with an ESI of zero. For known
   unicast traffic, the Leaf flag bit is set to one and this EC is
   advertised along with MAC/IP Advertisement route per section 3.1.

   The E-TREE Extended Community is encoded as an 8-octet value as
   follows:

        0                   1                   2                   3
        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       | Type=0x06     | Sub-Type=0x05 | Flags(1 Octet)|  Reserved=0   |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
       |  Reserved=0   |           Leaf Label                          |
       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   The low-order bit of the Flags octet is defined as the "Leaf-
   Indication" bit. A value of one indicates a Leaf AC/Site.

   When this EC is advertised along with MAC/IP Advertisement route (for
   known unicast traffic), the Leaf-Indication flag MUST be set to one
   and Leaf Label is set to zero. The receiving PE should ignore Leaf
   Label and only process the Leaf-Indication flag. A value of zero for
   Leaf-Indication flag is invalid when sent along with MAC/IP
   advertisement route and an error should be logged.

   When this EC is advertised along with Ethernet A-D per ES route (with
   ESI of zero) for BUM traffic, the Leaf Label MUST be set to a valid
   MPLS label and the Leaf-Indication flag should be set to zero. The
   receiving PE should ignore the Leaf-Indication flag. A non-valid MPLS
   label when sent along with the Ethernet A-D per ES route, should be
   logged as an error.
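
The 8-octet layout in 5.1 can be illustrated with a short encode/decode sketch. The function names are hypothetical, and the Leaf Label is treated as an opaque 24-bit field: how a 20-bit MPLS label is placed inside those 24 bits is not stated in this section, so that mapping is deliberately left out.

```python
import struct

EVPN_TYPE = 0x06        # Type field value from the diagram
ETREE_SUBTYPE = 0x05    # Sub-Type field value from the diagram
LEAF_INDICATION = 0x01  # low-order bit of the Flags octet


def encode_etree_ec(leaf_indication: bool, leaf_label_field: int) -> bytes:
    """Build the 8-octet value: Type, Sub-Type, Flags, Reserved, Reserved, Leaf Label."""
    if not 0 <= leaf_label_field <= 0xFFFFFF:
        raise ValueError("Leaf Label field must fit in 24 bits")
    flags = LEAF_INDICATION if leaf_indication else 0
    # The trailing 32-bit word is one Reserved octet (zero) plus the 24-bit Leaf Label.
    return struct.pack("!BBBBI", EVPN_TYPE, ETREE_SUBTYPE, flags, 0, leaf_label_field)


def decode_etree_ec(data: bytes):
    """Return (leaf_indication, leaf_label_field) from an 8-octet E-TREE EC."""
    if len(data) != 8:
        raise ValueError("an Extended Community is exactly 8 octets")
    ec_type, sub_type, flags, _reserved, word = struct.unpack("!BBBBI", data)
    if (ec_type, sub_type) != (EVPN_TYPE, ETREE_SUBTYPE):
        raise ValueError("not an E-TREE Extended Community")
    return bool(flags & LEAF_INDICATION), word & 0xFFFFFF


# With a MAC/IP Advertisement route: flag set, label zero.
mac_route_ec = encode_etree_ec(True, 0)
# With an Ethernet A-D per ES route: label field set (value is illustrative), flag zero.
ad_route_ec = encode_etree_ec(False, 0x012345)
```

A receiver would then apply the per-route rules above, for example ignoring the label field on MAC/IP Advertisement routes.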

5.2 PMSI Tunnel Attribute

   [RFC6514] defines PMSI Tunnel attribute which is an optional
   transitive attribute with the following format:

         +---------------------------------+
         |  Flags (1 octet)                |
         +---------------------------------+
         |  Tunnel Type (1 octets)         |
         +---------------------------------+
         |  MPLS Label (3 octets)          |
         +---------------------------------+
         |  Tunnel Identifier (variable)   |
         +---------------------------------+

   This draft uses all the fields per existing definition except for the
   following modifications to the Tunnel Type and Tunnel Identifier:

   When a receiver ingress-replication label is needed, the high-order bit
   of the tunnel type field (C bit - Composite tunnel bit) is set while
   the remaining low-order seven bits indicate the tunnel type as
   before. When this C bit is set, the "tunnel identifier" field would
   begin with a three-octet label, followed by the actual tunnel
   identifier for the transmit tunnel.  PEs that don't understand the
   new meaning of the high-order bit would treat the tunnel type as an
   invalid tunnel type. For the PEs that do understand the new meaning
   of the high-order bit, if ingress replication is desired when sending
   BUM traffic, the PE will use the label in the Tunnel Identifier field
   when sending its BUM traffic.

I think we should have some text to unambiguously state that: "Using the Composite flag for Tunnel Types 0x00 'no tunnel information present' and 0x06 'Ingress Replication' is invalid, and should be treated as an invalid tunnel type on reception".

Done.
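
For illustration, a receiver that understands the C bit might split the fields roughly as below. This is a sketch under assumptions: the function name is hypothetical, the three-octet label is returned as raw bytes without interpreting it, and raising ValueError stands in for "treat as an invalid tunnel type".

```python
COMPOSITE_BIT = 0x80                  # high-order bit of the Tunnel Type octet
INVALID_WITH_COMPOSITE = {0x00, 0x06}  # no tunnel information present, Ingress Replication


def split_tunnel_fields(tunnel_type_octet: int, tunnel_identifier: bytes):
    """Return (base_tunnel_type, receive_label_or_None, transmit_tunnel_identifier)."""
    composite = bool(tunnel_type_octet & COMPOSITE_BIT)
    base_type = tunnel_type_octet & 0x7F  # low-order seven bits keep their old meaning
    if not composite:
        return base_type, None, tunnel_identifier
    if base_type in INVALID_WITH_COMPOSITE:
        raise ValueError("invalid tunnel type: C bit set on 0x%02x" % base_type)
    if len(tunnel_identifier) < 3:
        raise ValueError("composite tunnel identifier is shorter than the 3-octet label")
    # The identifier begins with the three-octet label, then the transmit tunnel identifier.
    return base_type, tunnel_identifier[:3], tunnel_identifier[3:]


plain = split_tunnel_fields(0x06, b"\xc0\x00\x02\x01")
composite = split_tunnel_fields(0x81, b"\x00\x01\x02" + b"tunnel-id")
```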

Additionally, since RFC7385 has created a registry for PMSI Tunnel attribute tunnel types, taking the most significant bit from this field can't be done without a significant change to how this registry is organized (because now you can't take values in 0x7b-0x7f without colliding with values which are Experimental or Reserved).

Achieving the above requires an update of RFC7385, so I would suggest adding an 8.1 section saying this:

---
The "P-Multicast Service Interface Tunnel (PMSI Tunnel) Tunnel Types" registry in the "Border Gateway Protocol (BGP) Parameters" registry needs to be updated to reflect the use of the most significant bit to advertise the use of "composite tunnels" (section 5.2).

For this purpose, this document updates RFC7385.

The registry is to be updated, by removing the entries for 0xFB-0xFE and 0xFF, and replacing them by:
- 0x7B-0x7E Reserved for Experimental Use [this document]
- 0x7F  Reserved [this document]
- 0x80-0xFF Not Allocatable, corresponds to Composite tunnel types [this document]

The allocation policy for values 0x00 to 0x7A is IETF Review [RFC5226<https://tools.ietf.org/html/rfc5226>].
The range for experimental use is now 0x7B-0x7E, and values in this range are not to be assigned.
The status of 0x7F may only be changed through Standards Action [RFC5226<https://tools.ietf.org/html/rfc5226>].

Done. Thanks for providing the text. It was very helpful.

----
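
The proposed registry layout can be summarized in a small lookup. This is only a sketch of the ranges listed in the suggested text; the function name and the returned strings are illustrative.

```python
def classify_tunnel_type(value: int) -> str:
    """Map a Tunnel Type octet to its range in the proposed registry layout."""
    if not 0 <= value <= 0xFF:
        raise ValueError("Tunnel Type is a single octet")
    if value <= 0x7A:
        return "IETF Review"
    if value <= 0x7E:
        return "Reserved for Experimental Use"
    if value == 0x7F:
        return "Reserved"
    # 0x80-0xFF: most significant bit set, i.e. a Composite tunnel type.
    return "Not Allocatable (Composite tunnel type)"


ranges = {hex(v): classify_tunnel_type(v) for v in (0x06, 0x7B, 0x7F, 0x86)}
```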

In addition to this section 8.1:
- the document header needs to specify "Updates: RFC7385".
- the following can be added to the abstract (because reviewers often want to see in the abstract the explanation for why an RFC is updated): "This document makes use of the most significant bit of the scope governed by the IANA registry created by RFC7385, and hence updates that RFC accordingly."

Done.


6  Acknowledgement

   We would like to thank Dennis Cai, Antoni Przygienda, and Jeffrey
   Zhang for their valuable comments.

7  Security Considerations

   Since this draft uses the EVPN constructs of [RFC7432] and [RFC7623],
   the same security considerations in these drafts are also applicable
   here. Furthermore, this draft provides an additional security check by
   allowing sites (or ACs) of an EVPN instance to be designated as
   "Root" or "Leaf" and preventing any traffic exchange among "Leaf"
   sites of that VPN through ingress filtering for known unicast traffic
   and egress filtering for BUM traffic.

8  IANA Considerations

   This document requests the allocation of value 5 in the "EVPN
   Extended Community Sub-Types" registry defined in [RFC7153] and
   modification of the registry as follows:

The text above should be reformulated to reflect the fact that IANA has already allocated the value: "IANA has allocated value 5 in the "EVPN Extended Community Sub-Types" registry defined in [RFC7153] as follows:"


         SUB-TYPE VALUE     NAME                        Reference
         0x05               E-TREE Extended Community   This document

Done.


9  References

9.1  Normative References

   [KEYWORDS] Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC7432] Sajassi et al., "BGP MPLS Based Ethernet VPN", February,
              2015.

   [RFC7623] Sajassi et al., "Provider Backbone Bridging Combined with
              Ethernet VPN (PBB-EVPN)", September, 2015.

RFC7385 will have to be added to the list.
Done.

9.2  Informative References

   [RFC7387] Key et al., "A Framework for E-Tree Service over MPLS
   Network", October 2014.





   [RFC7153] Rosen et al., "IANA Registries for BGP Extended
   Communities",  March, 2014.

I think this should be under Normative References.
Done

   [RFC6514] Aggarwal et al., "BGP Encodings and Procedures for
   Multicast in MPLS/BGP IP VPNs",  February, 2012.

Ditto.
Done


Thanks,

-Thomas