Re: [bess] shepherd review of draft-ietf-bess-evpn-etree

Thomas Morin <thomas.morin@orange.com> Tue, 06 December 2016 09:55 UTC

From: Thomas Morin <thomas.morin@orange.com>
To: "Ali Sajassi (sajassi)" <sajassi@cisco.com>, "draft-ietf-bess-evpn-etree@ietf.org" <draft-ietf-bess-evpn-etree@ietf.org>, Loa Andersson <loa@pi.nu>, "George Swallow -T (swallow - MBO PARTNERS INC at Cisco)" <swallow@cisco.com>, Eric Rosen <erosen@juniper.net>, BESS <bess@ietf.org>
References: <3323ddae-c96f-49a4-2dec-1bfc4ed857dc@orange.com> <D3EA14B3.1B9CAE%sajassi@cisco.com> <6cb41698-b98b-ecbf-9e34-660771bd3fb8@orange.com> <D42D4E86.1BE849%sajassi@cisco.com>
Organization: Orange
Message-ID: <0b846411-4526-c6d3-3ea4-87ebd90de953@orange.com>
Date: Tue, 06 Dec 2016 10:55:27 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Thunderbird/45.5.1
MIME-Version: 1.0
In-Reply-To: <D42D4E86.1BE849%sajassi@cisco.com>
Content-Type: multipart/alternative; boundary="------------C32DDCF9198071689B8A4AD0"
Archived-At: <https://mailarchive.ietf.org/arch/msg/bess/kcjYVS5AzQvNlhL40P_YRPf5Fg0>
Cc: Martin Vigoureux <martin.vigoureux@nokia.com>
Subject: Re: [bess] shepherd review of draft-ietf-bess-evpn-etree

Hi Ali,

Ali Sajassi (sajassi):
> Thanks again for your additional comments. Below, please find my 
> comment resolutions. Let me know please if there are any further 
> comments.

Answers below...


>>>    EVI. The purpose of this topology constraint is to avoid having PEs
>>>    with only  Leaf sites importing and processing BGP MAC routes from
>>>    each other. To support such topology constrain in EVPN, two BGP
>>>    Route-Targets (RTs) are used for every EVPN Instance (EVI): one RT is
>>>    associated with the Root sites and the other is associated with the
>>>    Leaf sites. On a per EVI basis, every PE exports the single RT
>>>    associated with its type of site(s). Furthermore, a PE with Root
>>>    site(s) imports both Root and Leaf RTs, whereas a PE with Leaf
>>>    site(s) only imports the Root RT.
>>
>> The text seems to imply that the above is sufficient to deliver the 
>> service, but I fail to see what would prevent Leaf-to-Leaf traffic 
>> between Leaves bound to the same MAC-VRF (ES2 and ES3 in figure 1). 
>> Shouldn't the text mention the use of a split-horizon in Leaf MAC-VRFs ?
>>
>>
>> Agree, nice catch!. I changed the first sentence from:
>> "In such scenario, an EVPN PE implementation MAY provide E-TREE 
>> service using topology constraint among the PEs belonging to the same 
>> EVI."
>> TO
>> "In such scenario, topology constraint, provided by BGP Route Target 
>> (RT) import/export policies among the PEs belonging to the same EVI, 
>> can be used to restrict the communications among Leaf PEs."
>
> The sentence above does not address my question in fact, which was 
> about communication between Leaf ACs (rather than about communication 
> between Leaf PEs)
> Let me restate here, more clearly: I fail to see what would prevent 
> Leaf-to-Leaf traffic between **ACs** bound to the same MAC-VRF (ES2 
> and ES3 in figure 1).  Shouldn't the text mention the use of a 
> split-horizon in Leaf MAC-VRFs ?
>
> OK. I mentioned the use of split-horizon filtering explicitly for 
> blocking inter-Leaf communication within the same PE.
>
> "In such scenario, using tailored BGP Route Target (RT) import/export 
> policies among the PEs belonging to the same EVI, can be used to 
> restrict the communications among Leaf PEs. To restrict the 
> communications among leaf sites connected to the same PE  and 
> belonging to the same EVI, split-horizon filtering is used - i.e., the 
> interfaces associated with Leaf sites are placed in the same 
> split-horizon group. "
>

Mentioning this here is an improvement.
Perhaps a reference is needed to explain what a split-horizon _group_ is 
though, or simply rephrase to avoid the notion ?

Proposal: "split-horizon is used to block traffic from one Leaf 
interface to another Leaf interface of a given E-TREE EVI".
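
To make the intended behaviour concrete, here is a minimal sketch of that rule, purely as an illustration and not as text for the draft. The class, the function and the AC names are hypothetical; the only assumption is that each AC bound to the MAC-VRF is tagged as Leaf or Root.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AttachmentCircuit:
    name: str  # hypothetical label, e.g. "ES2"
    role: str  # "leaf" or "root"


def may_forward(ingress: AttachmentCircuit, egress: AttachmentCircuit) -> bool:
    """Split-horizon check between two ACs bound to the same MAC-VRF
    of a given E-TREE EVI (illustrative only)."""
    if ingress == egress:
        return False  # never send a frame back out of the AC it came from
    # Leaf-to-Leaf is the only combination that is blocked.
    return not (ingress.role == "leaf" and egress.role == "leaf")


es2 = AttachmentCircuit("ES2", "leaf")
es3 = AttachmentCircuit("ES3", "leaf")
root_ac = AttachmentCircuit("ES-root", "root")  # assumed Root AC for the example
```

With this check, traffic between ES2 and ES3 is dropped locally, while traffic between either of them and a Root AC is still forwarded.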

>>
>> (assuming the previous point is resolved:)
>>
>> With this mechanism above, isn't it possible to have on a given PE, 
>> for a single E-TREE EVI, both Leaves and Roots, as long as distinct 
>> MAC-VRFs are used (one for Leaves and one for Roots) ?   (it seems to 
>> me that the asymmetric import/export RT would do what is needed to 
>> build an E-TREE, we would just have a particular case where a Leaf 
>> MAC-VRF and a Root MAC-VRF for a given E-TREE end up on a single PE)
>>
>>
>> That’s not possible because per definition of an EVI, there is only a 
>> single MAC-VRF per EVI for a PE.
>
> Where can I read such a definition ? (the Terminology section in 
> RFC7432 does not say that, unless I'm missing something).
> And that seems a completely arbitrary restriction.
> (just thinking that a given PE device can be split in two logical 
> devices show that it can work)
>
> Section 6 of RFC7432 where it gives definitions for different service 
> interface types, it specifies the relationship between MAC-VRF and 
> VLAN (bridge table) and how many MAC-VRF (and bridge tables) can be 
> per EVI.

This section of RFC7432 discusses many different things for the 
different variants.
Can you provide a specific pointer about "how many MAC-VRFs can be per 
EVI" ?

> In bridging world, there can only be a single bridge table per VLAN in 
> a device.

I still don't find anything here that would preclude having, on a given 
PE, for a given E-TREE EVI, one Leaf MAC-VRF and one Root MAC-VRF: 
can't these two MAC-VRFs use different internal VLANs (with translation 
if the external VLANs are constrained) ?


>
>> Besides, I don’t understand what good does it do to have two MAC-VRFs 
>> on the same PE (one for Leafs and another for Roots)
>
> Well, the "what is good for" is pretty simple: it means you can have, 
> just by tailoring the import/export policies like in 2.1, something as 
> useful as the scenario in 2.2.
>
> There can only be a single bridge table per VLAN. Now even if you add 
> some kind of logic to form two logical PEs in single physical PE, you 
> end up replicating all the MAC addresses associated with the root 
> sites in two bridge tables.

Your point above certainly does not sound to me like "it can't be done": 
some may consider the above an acceptable cost, others may 
find ways to do this "replication" with a low overhead, and on some 
platforms the cost may be negligible.


>
>
>> because Leafs and Roots need to talk to each other and thus we want 
>> them to be in the same MAC-VRF.
>
> The fact that Leafs and Roots need to talk to each other does not mean 
> that they *have* to be in the same MAC-VRF: you can rely on the local 
> MPLS dataplane inside the PE to carry the traffic between a Leaf 
> MAC-VRF and a Root MAC-VRF (and you can possibly implement a shortcut 
> not involving MPLS encap/decap).
>
> Anything is possible but at what cost.

On cost, it is not always possible to reach conclusions that 
hold for all implementations and all targets.

> The current proposal is very efficient in terms of forwarding path as 
> well as control plane.

Sure, but what I question is not the new solution but the lack of 
discussion on why using the existing specs was not considered good enough.


I think my concern about clearly explaining the scenarios and 
motivations for this new spec could be addressed by splitting section 
2.2 into a 2.2.1 describing the approach from 2.1 and its possible 
drawbacks, and a 2.2.2 having essentially the content of the current 
section 2.2.

Here is a proposal:

2.2 Scenario 2: Leaf OR Root site(s) per AC

    In these scenarios, a PE receives traffic from either Root OR Leaf
    sites (but not both) on a given Attachment Circuit (AC) of an EVI. In
    other words, an AC (ES or ES/VLAN) is either associated with Root(s)
    or Leaf(s) (but not both).

2.2.1 Scenario 2a: Leaf OR Root site(s) per AC, separate Leaf/Root MAC-VRFs

                      +---------+            +---------+
                      |   PE1   |            |   PE2   |
     +---+            |  +---+  |  +------+  |  +---+  |            +---+
     |CE1+-----ES1----+--+   |  |  |      |  |  |MAC+--+---ES2/AC1--+CE2|
     +---+    (Leaf)  |  |MAC|  |  | MPLS |  |  |VRF|  |  (Leaf)    +---+
                      |  |VRF|  |  |  /IP |  |  '---'  |
                      |  |   |  |  |      |  |  .---.  |
                      |  |   |  |  |      |  |  |MAC|  |            +---+
                      |  |   |  |  |      |  |  |VRF+--+---ES2/AC2--+CE3|
                      |  +---+  |  +------+  |  +---+  |  (Root)    +---+
                      +---------+            +---------+

    Figure 2a: Scenario 2a

    In this scenario, the RT constraint procedures described in section 2.1
    could also be used. The feasibility and efficiency of this approach
    depend on platform specifics.

    This approach will lead to duplication of a large proportion of MAC
    addresses on PEs having both Leaf and Root sites, and is hence considered
    less suitable for deployment contexts where the vast majority of PEs are
    likely to ultimately have both Leaf and Root sites attached to them.

2.2.2 Scenario 2b: Leaf OR Root site(s) per AC, single MAC-VRF

                      +---------+            +---------+
                      |   PE1   |            |   PE2   |
     +---+            |  +---+  |  +------+  |  +---+  |            +---+
     |CE1+-----ES1----+--+   |  |  |      |  |  |   +--+---ES2/AC1--+CE2|
     +---+    (Leaf)  |  |MAC|  |  | MPLS |  |  |MAC|  |  (Leaf)    +---+
                      |  |VRF|  |  |  /IP |  |  |VRF|  |
                      |  |   |  |  |      |  |  |   |  |            +---+
                      |  |   |  |  |      |  |  |   +--+---ES2/AC2--+CE3|
                      |  +---+  |  +------+  |  +---+  |  (Root)    +---+
                      +---------+            +---------+

    Figure 2b: Scenario 2b

    This scenario alleviates key drawbacks of Scenario 2a, in particular
    by avoiding duplication of MAC addresses on Leaf/Root PEs and avoiding
    the operational overhead of managing more than one RT.

    This approach comes at the expense of having routes for unneeded MAC
    addresses on Leaf-only PEs, and is hence considered less suitable for
    deployment contexts where the vast majority of PEs would remain
    Leaf-only.

    Unlike Scenario 1 and Scenario 2a, this scenario requires additional
    procedures provided in this document.


(And this last sentence should be added to section 2.3 as well)
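
For reference, the RT rule from section 2.1 that Scenario 2a relies on can be sketched as below. This is only an illustration of the import/export logic quoted earlier in this thread; the RT values and function names are made up, and each MAC-VRF is assumed to hold only Root sites or only Leaf sites.

```python
ROOT_RT = "RT-root"  # hypothetical RT values, one pair per EVI
LEAF_RT = "RT-leaf"


def rt_policy(role):
    """Return (export_rts, import_rts) for a MAC-VRF holding only
    Root sites or only Leaf sites (illustrative only)."""
    if role == "root":
        return {ROOT_RT}, {ROOT_RT, LEAF_RT}  # exports Root RT, imports both
    if role == "leaf":
        return {LEAF_RT}, {ROOT_RT}  # exports Leaf RT, imports Root RT only
    raise ValueError(f"unknown role: {role}")


def imports_from(receiver_role, sender_role):
    """True if a MAC-VRF of receiver_role imports routes exported
    by a MAC-VRF of sender_role."""
    export_rts, _ = rt_policy(sender_role)
    _, import_rts = rt_policy(receiver_role)
    return bool(export_rts & import_rts)


# Import matrix keyed by (receiver, sender).
matrix = {
    (r, s): imports_from(r, s)
    for r in ("root", "leaf")
    for s in ("root", "leaf")
}
```

The only pair that does not import is Leaf from Leaf. In Scenario 2a the same rule would simply apply to the Leaf MAC-VRF and the Root MAC-VRF sitting on the same PE.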

>
>>> For this scenario, if for a given
>>>    EVI, the majority of PEs will eventually have both Leaf and Root
>>>    sites attached, even though they may start as Root-only or Leaf-only
>>>    PEs, then it is recommended to use a single RT per EVI and avoid
>>>    additional configuration and operational overhead.
>>
>> Why this recommendation ?
>> Even with a majority of PEs having both Leaves and Roots, there can 
>> remain (up to 49% of) PEs having only Leaves, which will uselessly 
>> have all routes to other Leaves.
>>
>> So "it is recommended" above, deserves to be explained more, I think.
>>
>> OK, I changed “majority” to “vast majority” :-)
>
> My point was not to nit pick on "majority", but was that you should 
> explain why you recommend that.
> As the text currently reads, the cost of the recommendation can be 
> identified: having useless routes on the fraction of PEs having only 
> Leaves.
> But the gain brought by the recommendation is not even mentioned, not 
> to say explained.
> Hence: why ?
> (Why is it a useful tradeoff to have useless routes on some, even if 
> only one, PE ?)
>
> Changed the last sentence from:
> "then it is recommended to use a single RT per EVI and avoid 
> additional configuration and operational overhead.”
> To
> "then it is recommended to use a single RT per EVI and avoid 
> additional configuration and operational overhead
> at the expense of having unwanted MAC addresses on the Leaf PEs."

Ok. I adapted and incorporated this addition into my proposed text 
splitting 2.2 into a 2.2.1 and a 2.2.2.

Best,

-Thomas