Re: [bess] WG Last Call for draft-ietf-bess-evpn-prefix-advertisement-04

"Rabadan, Jorge (Nokia - US/Mountain View)" <jorge.rabadan@nokia.com> Thu, 23 March 2017 01:31 UTC

From: "Rabadan, Jorge (Nokia - US/Mountain View)" <jorge.rabadan@nokia.com>
To: Eric C Rosen <erosen@juniper.net>, "Vigoureux, Martin (Nokia - FR/Nozay)" <martin.vigoureux@nokia.com>, BESS <bess@ietf.org>
Date: Thu, 23 Mar 2017 01:31:11 +0000
Message-ID: <6DD233B4-D233-44AD-96D0-6AFFA0B02731@on.nokia.com>
References: <3035f4d6-163e-2f90-8462-74fa8801540b@nokia.com> <9fa61e3a-50e6-2b27-86a8-f0988194f5a9@juniper.net>
In-Reply-To: <9fa61e3a-50e6-2b27-86a8-f0988194f5a9@juniper.net>
Archived-At: <https://mailarchive.ietf.org/arch/msg/bess/RXjXGjud1RzNtA6ESNIcYvydBQA>

Hi Eric,

Thank you for your thorough review. I made quite a few changes based on your and Jeffrey’s input.
Please see my responses in-line, one by one. Also please see the new version sent in my reply to Jeffrey.
Thx,
Jorge


On 2/21/17, 3:41 PM, "BESS on behalf of Eric C Rosen" <bess-bounces@ietf.org on behalf of erosen@juniper.net> wrote:

    While I would like to see this document advance eventually, I don't 
    think it is ready yet.
    
    Main points:
    
    - There is no clear explanation of the key concept of "overlay index".  
    In particular, there is no real explanation of when to use an overlay 
    index, or of when to use each kind of overlay index. There are some use 
    case descriptions, and some examples of the sort "in this use case use 
    this kind of overlay index", but no rules that specify the precise 
    circumstances under which it is appropriate to use each kind of overlay 
    index.
[JORGE] I added a specific section about this. Hope it helps clarify.
    
    - There are no rules given that say how an NVE knows whether to 
    originate an RT-5 route, or knows how to construct an RT-5 route.  A 
    number of use cases are walked through, which is helpful, but that is 
    not a substitute for a real specification.
[JORGE] In most cases, the use of one particular model is a matter of local policy. I made a few changes, though, based on your comments.
    
    - In the discussion of use case, there are statements like "support of 
    this use case is REQUIRED".  But it is very difficult to know exactly 
    which features of the protocol are being mandated.
[JORGE] I tried to clarify. Let us know if it helps.
    
    - Too much of the draft is the "RT-5's are good" sales pitch, which is 
    repeated three times.  (Sections 2.2, 4, and 6.)  A single time would do.
[JORGE] I removed section 4 as suggested and made the conclusions section just a short summary.
    
    - The talk of IP-VRFs is a bit misleading, as this might be taken to 
    suggest that the document provides a way to interoperate L3VPN with EVPN.
[JORGE] Added this in the terminology section: “IP-VRF: A VPN Routing and Forwarding table for IP addresses on an NVE/PE, similar to the VRF concept defined in [RFC4364], however, in this document, the IP routes are always populated by the EVPN address family.”
    
    I've attached the draft with some more comments in-line, look for lines 
    beginning with ****.
[JORGE] Please see my comments along yours.
    



-------------------------------------------------------

Abstract

**** Perhaps: "may not support their own" -->  "do not necessarily
**** participate in dynamic"
[JORGE] Done.

...

1. Terminology
...

   Overlay index: object used in the IP Prefix route, as described in
   this document. It can be an IP address in the tenant space or an ESI,
   and identifies a pointer yielded by the IP route lookup at the
   routing context importing the route. An overlay index always needs a
   recursive route resolution on the NVE receiving the IP Prefix route,
   so that the NVE knows to which egress NVE it needs to forward the
   packets.

**** I can't really understand this description of "overlay index", and the
**** concept is never really explained in this draft.  All we have is a set
**** of use cases, and we are told to set the overlay index in a certain way
**** for a particular use case.  It's never stated just how an NVE figures
**** out what to specify as "overlay index" in any given RT-5 route.

**** I think the "overlay index" is really intended to allow an NVE to
**** specify either an ESI or an IP address (in the address space of the
**** tenant system) as the next hop for a given IP prefix, where the
**** resolution of the next hop may lead to an NVE which is not the NVE
**** originating the route.  A short explanation of why this is needed in
**** EVPN (when it isn't, e.g., needed in L3VPN) would be useful.

[JORGE] OK, I removed the term from section 1 and created section 3.2 to explain the concept better.

   Underlay next-hop: IP address sent by BGP along with any EVPN route,
   i.e. BGP next-hop. It identifies the NVE sending the route and it is
   used at the receiving NVE as the VXLAN destination VTEP or NVGRE
   destination end-point.

**** In general, the BGP next hop does not identify the originator of the
**** route, as the route may have passed through one or more ASBRs that are
**** configured for "next hop self".  Unless you want the VXLAN tunnels to
**** terminate at the ASBR, it's a good idea to have a different means of
**** identifying the tunnel endpoint.

**** This also suggests that the document only applies to scenarios where
**** VXLAN tunneling is used, which I don't think is the intention.

[JORGE] OK, that’s fair. Moreover, I don’t think we need to explain what a BGP next-hop is, so I removed this “Underlay next-hop” point, which was only confusing things.


   Ethernet NVO tunnel: it refers to Network Virtualization Overlay
   tunnels with Ethernet payload. Examples of this type of tunnels are
   VXLAN or nvGRE.

   IP NVO tunnel: it refers to Network Virtualization Overlay tunnels
   with IP payload (no MAC header in the payload). Examples of IP NVO
   tunnels are VXLAN GPE or MPLSoGRE (both with IP payload).

**** These examples are a bit odd, as MPLS can carry either frames or
**** packets as payload, and whether MPLS is itself carried inside GRE is
**** irrelevant. 
[JORGE] Yes, but that is why we added “(both with IP payload)”. I’ll remove the last sentence to avoid confusion.


2. Introduction and problem statement

   Inter-subnet connectivity is required for certain tenants within the
   Data Center. [EVPN-INTERSUBNET] defines some fairly common inter-
   subnet forwarding scenarios where TSes can exchange packets with TSes
 


Rabadan et al.          Expires August 17, 2017                 [Page 3]


Internet-Draft         EVPN Prefix Advertisement       February 13, 2017


   located in remote subnets. In order to meet this requirement,
   [EVPN-INTERSUBNET] describes how MAC/IPs encoded in TS RT-2 routes
   are not only used to populate MAC-VRF and overlay ARP tables, but
   also IP-VRF tables with the encoded TS host routes (/32 or /128). In
   some cases, EVPN may advertise IP Prefixes and therefore provide
   aggregation in the IP-VRF tables, as opposed to program individual
   host routes. This document complements the scenarios described in
   [EVPN-INTERSUBNET] and defines how EVPN may be used to advertise IP
   Prefixes.

**** I guess issues of interoperating EVPN with L3VPN are out of scope,
**** because there is no discussion of using EVPN routes to populate the
**** VRFs of an L3VPN or of using VPN-IP routes to populate the EVPN
**** IP-VRFs.  If this is really out of scope, please state that explicitly.

[JORGE] Added: “Interoperability between EVPN and L3VPN [RFC4364] IP Prefix
   routes is out of the scope of this document.”

   Section 2.1 describes the inter-subnet connectivity requirements in
   Data Centers. Section 2.2 explains why a new EVPN route type is
   required for IP Prefix advertisements. Once the need for a new EVPN
   route type is justified, sections 3, 4 and 5 will describe this route
   type and how it is used in some specific use cases.  

2.1 Inter-subnet connectivity requirements in Data Centers

   [RFC7432] is used as the control plane for a Network Virtualization
   Overlay (NVO3) solution in Data Centers (DC), where Network
   Virtualization Edge (NVE) devices can be located in Hypervisors or
   TORs, as described in [EVPN-OVERLAY].

   If we use the term Tenant System (TS) to designate a physical or
   virtual system identified by MAC and IP addresses, and connected to
   an EVPN instance, the following considerations apply:

**** Please say exactly what it means for a TS to be "connected to an EVPN
**** instance".    

[JORGE] added: “and connected to a MAC-VRF by an Attachment Circuit”

   o The Tenant Systems may be Virtual Machines (VMs) that generate
     traffic from their own MAC and IP.

   o The Tenant Systems may be Virtual Appliance entities (VAs) that
     forward traffic to/from IP addresses of different End Devices
     seating behind them.

        o These VAs can be firewalls, load balancers, NAT devices, other
          appliances or virtual gateways with virtual routing instances.

        o These VAs do not have their own routing protocols and hence
          rely on the EVPN NVEs to advertise the routes on their behalf.

**** The VAs have "virtual routing instances" but do not have routing
**** protocols?  Do you mean that the VAs may have multiple routing contexts
**** (e.g., one per tenant or one per virtual network), but do not
**** necessarily participate in dynamic routing protocols?

[JORGE] changed to 
“These VAs do not necessarily participate in dynamic routing protocols and hence rely on the EVPN NVEs to advertise the routes on their behalf.”


        o In all these cases, the VA will forward traffic to the Data
          Center using its own source MAC but the source IP will be the
          one associated to the End Device seating behind or a

**** "seating" --> "sitting"
[JORGE] changed.

**** Not sure how to interpret "forward traffic to the Data Center".  Isn't
**** the VA already in the data center?  

[JORGE] changed to:
“the VA will forward traffic to other TSes”

          translated IP address (part of a public NAT pool) if the VA is
          performing NAT.

        o Note that the same IP address could exist behind two of these
          TS. One example of this would be certain appliance resiliency
          mechanisms, where a virtual IP or floating IP can be owned by
          one of the two VAs running the resiliency protocol (the master
          VA). VRRP is one particular example of this. Another example
          is multi-homed subnets, i.e. the same subnet is connected to
          two VAs.

        o Although these VAs provide IP connectivity to VMs and subnets
          behind them, they do not always have their own IP interface
          connected to the EVPN NVE, e.g. layer-2 firewalls are examples
          of VAs not supporting IP interfaces.

   The following figure illustrates some of the examples described
   above.
                       NVE1                                            
                    +-----------+                                     
           TS1(VM)--|(MAC-VRF10)|-----+                               
             IP1/M1 +-----------+     |               DGW1            
                                  +---------+    +-------------+       
                                  |         |----|(MAC-VRF10)  |       
     SN1---+           NVE2       |         |    |    IRB1\    |      
           |        +-----------+ |         |    |     (IP-VRF)|---+   
     SN2---TS2(VA)--|(MAC-VRF10)|-|         |    +-------------+  _|_  
           | IP2/M2 +-----------+ |  VXLAN/ |                    (   ) 
     IP4---+  <-+                 |  nvGRE  |         DGW2      ( WAN )
                |                 |         |    +-------------+ (___) 
             vIP23 (floating)     |         |----|(MAC-VRF10)  |   |   
                |                 +---------+    |    IRB2\    |   |  
     SN1---+  <-+      NVE3         |  |  |      |     (IP-VRF)|---+   
           | IP3/M3 +-----------+   |  |  |      +-------------+       
     SN3---TS3(VA)--|(MAC-VRF10)|---+  |  |                            
           |        +-----------+      |  |                            
     IP5---+                           |  |                            
                                       |  |                            
                    NVE4               |  |      NVE5            +--SN5
              +---------------------+  |  | +-----------+        |     
     IP6------|(MAC-VRF1)           |  |  +-|(MAC-VRF10)|--TS4(VA)--SN6
              |       \             |  |    +-----------+        |    
              |    (IP-VRF)         |--+                ESI4     +--SN7
              |       /  \IRB3      |                                 
          |---|(MAC-VRF2)(MAC-VRF10)|                                  
       SN4|   +---------------------+                                  

                    Figure 1 DC inter-subnet use-cases

   Where:

   NVE1, NVE2, NVE3, NVE4, NVE5, DGW1 and DGW2 share the same EVI for a
   particular tenant. EVI-10 is comprised of the collection of MAC-VRF10
   instances defined in all the NVEs. All the hosts connected to EVI-10
   belong to the same IP subnet. The hosts connected to EVI-10 are
   listed below:

        o TS1 is a VM that generates/receives traffic from/to IP1, where
          IP1 belongs to the EVI-10 subnet.

        o TS2 and TS3 are Virtual Appliances (VA) that generate/receive
          traffic from/to the subnets and hosts seating behind them
          (SN1, SN2, SN3, IP4 and IP5). Their IP addresses (IP2 and IP3)
          belong to the EVI-10 subnet and they can also generate/receive
          traffic. When these VAs receive packets destined to their own
          MAC addresses (M2 and M3) they will route the packets to the
          proper subnet or host. These VAs do not support routing
          protocols to advertise the subnets connected to them and can
          move to a different server and NVE when the Cloud Management
          System decides to do so. These VAs may also support redundancy
          mechanisms for some subnets, similar to VRRP, where a floating
          IP is owned by the master VA and only the master VA forwards
          traffic to a given subnet. E.g.: vIP23 in figure 1 is a
          floating IP that can be owned by TS2 or TS3 depending on who
          the master is. Only the master will forward traffic to SN1.  

        o Integrated Routing and Bridging interfaces IRB1, IRB2 and IRB3
          have their own IP addresses that belong to the EVI-10 subnet
          too. These IRB interfaces connect the EVI-10 subnet to Virtual
          Routing and Forwarding (IP-VRF) instances that can route the
          traffic to other connected subnets for the same tenant (within
          the DC or at the other end of the WAN).

        o TS4 is a layer-2 VA that provides connectivity to subnets SN5,
          SN6 and SN7, but does not have an IP address itself in the
          EVI-10. TS4 is connected to a physical port on NVE5 assigned
          to Ethernet Segment Identifier 4.

   All the above DC use cases require inter-subnet forwarding and
   therefore the individual host routes and subnets: 

   a) MUST be advertised from the NVEs (since VAs and VMs do not run
      routing protocols) and
   b) MAY be associated to an overlay index that can be a VA IP address,
      a floating IP address or an ESI.

**** Still not understanding the "overlay index" concept.      
[JORGE] added: “An Overlay Index is a next-hop that requires a recursive resolution and it is described in section 3.2.”


2.2 The requirement for a new EVPN route type   

   [RFC7432] defines a MAC/IP route (also referred to as RT-2) where a MAC
   address can be advertised together with an IP address length (IPL)
   and IP address (IP). While a variable IPL might have been used to
   indicate the presence of an IP prefix in a route type 2, there are
   several specific use cases in which using this route type to deliver
   IP Prefixes is not suitable.

   One example of such use cases is the "floating IP" example described
   in section 2.1. In this example we need to decouple the advertisement
   of the prefixes from the advertisement of the floating IP (vIP23 in
   figure 1) and MAC associated to it, otherwise the solution gets
   highly inefficient and does not scale. 

   E.g.: if we are advertising 1k prefixes from M2 (using RT-2) and the
   floating IP owner changes from M2 to M3, we would need to withdraw 1k
   routes from M2 and re-advertise 1k routes from M3. However if we use
   a separate route type, we can advertise the 1k routes associated to
   the floating IP address (vIP23) and only one RT-2 for advertising the
   ownership of the floating IP, i.e. vIP23 and M2 in the route type 2.
   When the floating IP owner changes from M2 to M3, a single RT-2
   withdraw/update is required to indicate the change. The remote DGW
   will not change any of the 1k prefixes associated to vIP23, but will
   only update the ARP resolution entry for vIP23 (now pointing at M3).
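
   As a rough illustration of the arithmetic in the example above, the
   following Python sketch counts the route changes in both models and
   shows the receiver-side indirection. It is not taken from the draft:
   the function name, the dictionaries and the 10.x.y.0/24 prefixes are
   made up for illustration only.

```python
def updates_on_owner_change(num_prefixes: int, decoupled: bool) -> int:
    """Count route changes needed when the floating IP owner moves from M2 to M3."""
    if decoupled:
        # Prefix routes keep pointing at vIP23; only the single RT-2
        # that binds vIP23 to a MAC has to be updated.
        return 1
    # Prefixes carried in RT-2 are tied to the MAC: withdraw every route
    # from M2 and re-advertise every route from M3.
    return 2 * num_prefixes


# Receiver-side view of the decoupled model (illustrative data only).
prefix_to_gw = {f"10.{i // 256}.{i % 256}.0/24": "vIP23" for i in range(1000)}
arp_table = {"vIP23": "M2"}


def resolve_mac(prefix: str) -> str:
    """Two-step lookup: prefix -> floating IP -> MAC of the current owner."""
    return arp_table[prefix_to_gw[prefix]]


prefixes_before = dict(prefix_to_gw)
arp_table["vIP23"] = "M3"  # the single RT-2 update after the owner change
```

   With 1k prefixes, the coupled model needs 2000 route changes while
   the decoupled model needs one, and the 1k prefix entries on the
   receiver are left untouched.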

   Other reasons to decouple the IP Prefix advertisement from the MAC/IP
   route are listed below:

        o Clean identification, operation and troubleshooting of IP
          Prefixes, not subject to interpretation and independent of the
          IPL and the IP value. E.g.: a default IP route 0.0.0.0/0 must
          always be easily and clearly distinguished from the absence of
          IP information.

**** Good point, but of course a new route type is not required to deal with
**** this possible ambiguity.          
[JORGE] The 0.0.0.0/0 example refers to the fact that in a RT2, IPL=0 always means no IP, whereas in a RT5, IPL=0 is used to indicate the mask of the default route 0.0.0.0/0.
A new route type is not absolutely required for this (we could have redefined RT2), but we agreed it was the best and cleanest way. I prefer to leave the text as it is...

        o MAC address information must not be compared by BGP when
          selecting two IP Prefix routes.

**** Perhaps: "selecting" --> "choosing which of several IP Prefix routes to
**** install in a given IP-VRF" 
[JORGE] good one. I changed it.         

          If IP Prefixes were to be
          advertised using MAC/IP routes, the MAC information would
          always be present and part of the route key.

**** Presumably the decision procedure for installing prefixes into an IPVRF
**** could involve choosing among multiple different route types that have
**** the same prefix: RT-2, RT-5, and (for L3VPN interop) SAFI-128.  What
**** exactly is the process for choosing among these?
[JORGE] L3VPN is out of scope in this document, as mentioned (now) in the intro. About RT-2 and RT-5 carrying host routes (both can), IMO we should add a line to the inter-subnet-forwarding draft, since it is the draft that talks about both RT2 and RT5. If you think we should add the sentence here, we could do that too. Let us know.


        o IP Prefix routes must not be subject to MAC/IP route
          procedures such as MAC mobility or aliasing. Prefixes
          advertised from two different ESIs do not mean mobility; MACs
          advertised from two different ESIs do mean mobility. Similarly
          load balancing for IP prefixes is achieved through IP
          mechanisms such as ECMP, and not through MAC route mechanisms
          such as aliasing.

**** This is true, but it's not clear why the use of RT-2's is ruled out by
**** this.  

        o NVEs that do not require processing IP Prefixes must have an
          easy way to identify an update with an IP Prefix and ignore
          it, rather than processing the MAC/IP route to find out only
          later that it carries a Prefix that must be ignored.

**** How does an NVE know whether it is required to process IP prefixes?

**** The above points seem a bit weak, I think the real reason for having
**** the RT-5 is the one given at the start of this section.

[JORGE] OK, I removed the last two bullets, since they seem to create more noise than clarity.

 


   The following sections describe how EVPN is extended with a new route
   type for the advertisement of IP prefixes and how this route is used
   to address the current and future inter-subnet connectivity
   requirements existing in the Data Center.

3. The BGP EVPN IP Prefix route

   The current BGP EVPN NLRI as defined in [RFC7432] is shown below:

    +-----------------------------------+
    |    Route Type (1 octet)           |
    +-----------------------------------+
    |     Length (1 octet)              |
    +-----------------------------------+
    | Route Type specific (variable)    |
    +-----------------------------------+

   Where the route type field can contain one of the following specific
   values:

   + 1 - Ethernet Auto-Discovery (A-D) route

   + 2 - MAC/IP advertisement route

   + 3 - Inclusive Multicast Route

   + 4 - Ethernet Segment Route


**** Refer to the IANA "EVPN Route Types" registry.   
[JORGE] done

   This document defines an additional route type that will be used for
   the advertisement of IP Prefixes:

   + 5 - IP Prefix Route

**** Say that IANA has added this value to the registry.   
[JORGE] done

   The support for this new route type is OPTIONAL. 

   Since this new route type is OPTIONAL, an implementation not
   supporting it MUST ignore the route, based on the unknown route type
   value.

**** A reference to Section 5.4 of RFC 7606 would be useful here.  
[JORGE] done. Added to normative references too.
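
As a minimal sketch of the "ignore unknown route types" behavior discussed above, the Python below walks a buffer of EVPN NLRIs using the 1-octet Route Type and 1-octet Length header and skips any type the implementation does not support. The function name and the return shape are assumptions made for illustration; this is not how any particular BGP implementation is structured.

```python
import struct

# Route types listed in the draft text above; type 5 is deliberately absent
# to model an implementation that does not support the IP Prefix route.
SUPPORTED_ROUTE_TYPES = {
    1: "Ethernet Auto-Discovery (A-D) route",
    2: "MAC/IP advertisement route",
    3: "Inclusive Multicast Route",
    4: "Ethernet Segment Route",
}


def split_evpn_nlri(data: bytes, supported=SUPPORTED_ROUTE_TYPES):
    """Split a buffer of EVPN NLRIs into (route_type, body) pairs.

    Returns (routes, skipped), where skipped lists the route types that
    were ignored because they are not supported.
    """
    routes, skipped, offset = [], [], 0
    while offset < len(data):
        if offset + 2 > len(data):
            raise ValueError("truncated NLRI header")
        route_type, length = struct.unpack_from("!BB", data, offset)
        body = data[offset + 2 : offset + 2 + length]
        if len(body) != length:
            raise ValueError("truncated NLRI body")
        # The Length field lets the parser step over a route it cannot parse.
        offset += 2 + length
        if route_type in supported:
            routes.append((route_type, body))
        else:
            skipped.append(route_type)
    return routes, skipped
```

Because each NLRI carries its own length, an unknown type 5 route can be skipped without affecting the routes that follow it in the same buffer.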

   The detailed encoding of this route and associated procedures are
   described in the following sections.


3.1 IP Prefix Route encoding

   An IP Prefix advertisement route NLRI consists of the following
   fields:

 


    +---------------------------------------+
    |      RD   (8 octets)                  |
    +---------------------------------------+
    |Ethernet Segment Identifier (10 octets)|
    +---------------------------------------+
    |  Ethernet Tag ID (4 octets)           |
    +---------------------------------------+
    |  IP Prefix Length (1 octet)           |
    +---------------------------------------+
    |  IP Prefix (4 or 16 octets)           |
    +---------------------------------------+
    |  GW IP Address (4 or 16 octets)       |
    +---------------------------------------+
    |  MPLS Label (3 octets)                |
    +---------------------------------------+

   Where:

        o RD, Ethernet Tag ID and MPLS Label fields will be used as
          defined in [RFC7432] and [EVPN-OVERLAY].

**** This seems to make [EVPN-OVERLAY] a normative reference.  However, it
**** is listed below as an informational reference.  
[JORGE] Changed.        

        o The Ethernet Segment Identifier will be a non-zero 10-byte
          identifier if the ESI is used as an overlay index. It will be
          zero otherwise.

**** "If the ESI is used as an overlay index", what does that mean exactly?
[JORGE] Added: “(see the definition of overlay index in section 3.2)”          

        o The IP Prefix Length can be set to a value between 0 and 32
          (bits) for ipv4 and between 0 and 128 for ipv6.

**** Please mention that this is an unsigned number that specifies the
**** number of bits in the prefix.   
[JORGE] what do you mean by “unsigned”?       

        o The IP Prefix will be a 32 or 128-bit field (ipv4 or ipv6).

**** Please mention that the size of this field does not depend on the value
**** of the IPL field.        
[JORGE] Added.

        o The GW IP (Gateway IP Address) will be a 32 or 128-bit field
          (ipv4 or ipv6), and will encode an overlay IP index for the IP
          Prefixes. The GW IP field SHOULD be zero if it is not used as
          an overlay index.

**** "overlay index" again.    
[JORGE] Added: “Refer to section 3.2 for the definition and use of the Overlay Index.”        

        o The MPLS Label field is encoded as 3 octets, where the high-
          order 20 bits contain the label value. The value SHOULD be
          null when the IP Prefix route is used for a recursive lookup
          resolution.

**** What numerical value represents "null"?  How does one know that the
**** "null" value isn't the real label value being assigned?
[JORGE] It is zero. I added a note. As far as I know, RFC7432 uses the value “zero” in MPLS label fields ONLY when the label field is not used (for instance, in the AD per-ES route, the label must be set to 0 since it is not used).

**** What does one do if the value is not null, but it is necessary to do a
**** recursive lookup?
[JORGE] Added “If the received MPLS Label value is not null, the route MUST still be used for recursive lookup resolution if the local policy instructs the ingress NVE to do so.”

        o The total route length will indicate the type of prefix (ipv4
          or ipv6) and the type of GW IP address (ipv4 or ipv6). Note
          that the IP Prefix + the GW IP should have a length of either
          64 or 256 bits, but never 160 bits (ipv4 and ipv6 mixed values
          are not allowed).
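
   The field layout described above can be sketched in Python as
   follows. This is only an illustration of the encoding as quoted in
   this thread (the route type specific part, without the Route Type and
   Length octets); the function names and the error handling are
   assumptions, not part of the draft.

```python
import ipaddress
import struct


def encode_rt5(rd: bytes, esi: bytes, eth_tag: int,
               prefix: str, gw_ip: str, label: int) -> bytes:
    """Encode the route type specific part of an IP Prefix route (RT-5)."""
    net = ipaddress.ip_network(prefix)
    gw = ipaddress.ip_address(gw_ip)
    if net.version != gw.version:
        # Mixed ipv4/ipv6 values (160 bits in total) are not allowed.
        raise ValueError("IP Prefix and GW IP must be the same address family")
    if len(rd) != 8 or len(esi) != 10:
        raise ValueError("RD must be 8 octets and ESI 10 octets")
    if not 0 <= label < 2 ** 20:
        raise ValueError("label value must fit in 20 bits")
    # The label value sits in the high-order 20 bits of the 3-octet field.
    label_field = (label << 4).to_bytes(3, "big")
    return (
        rd
        + esi
        + struct.pack("!IB", eth_tag, net.prefixlen)
        + net.network_address.packed  # always 4 or 16 octets, whatever the IPL
        + gw.packed
        + label_field
    )


def address_family(route: bytes) -> str:
    """Infer ipv4 or ipv6 from the total length: 34 or 58 octets."""
    lengths = {34: "ipv4", 58: "ipv6"}
    if len(route) not in lengths:
        raise ValueError(f"unexpected RT-5 length: {len(route)}")
    return lengths[len(route)]
```

   The two valid lengths follow from the field sizes: 8 + 10 + 4 + 1 + 3
   = 26 octets of fixed fields, plus 8 octets (ipv4 prefix and GW IP) or
   32 octets (ipv6 prefix and GW IP).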

   The Eth-Tag ID, IP Prefix Length and IP Prefix will be part of the
   route key used by BGP to compare routes. The rest of the fields will
   not be part of the route key.

**** Hopefully Route Reflectors will ignore this statement, so that they can
**** propagate routes that differ only in their RDs.  Perhaps the statement
**** is meant to apply only when comparing routes that are being imported
**** into a given IPVRF?
[JORGE] This is consistent with RFC7432, section 7, in which the RD is assumed to be part of the route key but not mentioned. If we make the description inconsistent with RFC7432, it may be confusing.
I left the text as it is for the time being.
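
To make the point about the RD concrete, here is a small sketch in Python. The class `Rt5` and its `key` method are hypothetical; the only facts taken from the text are that Eth-Tag ID, IP Prefix Length and IP Prefix form the route key, and that the RD is assumed to be part of it as in RFC7432.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rt5:
    rd: str
    eth_tag: int
    ipl: int
    prefix: str
    gw_ip: str = "0.0.0.0"  # not part of the route key
    label: int = 0          # not part of the route key

    def key(self, include_rd: bool = True):
        """Route key used to compare routes.

        With include_rd=True the RD is part of the key, so routes that
        differ only in their RDs stay distinct (e.g. on a Route Reflector).
        """
        base = (self.eth_tag, self.ipl, self.prefix)
        return (self.rd,) + base if include_rd else base
```

Two routes for the same prefix from different RDs compare as different with `key()`, and as the same prefix with `key(include_rd=False)`, which is closer to what an importing IP-VRF would compare.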
   

   The route will contain a single overlay index at most, i.e. if the
   ESI field is different from zero, the GW IP field will be zero, and
   vice versa.

**** And what if one receives a route for which this is not the case?   
[JORGE] Added “A route containing more than one Overlay Index will be treated as-withdraw”
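
The check implied by this rule could look roughly like the following Python sketch. The function name and the returned strings are made up for illustration; the MAC overlay index (carried outside the ESI and GW IP fields) is not covered here.

```python
def classify_overlay_index(esi: bytes, gw_ip: bytes) -> str:
    """Classify an RT-5 by the overlay index encoded in ESI / GW IP.

    Assumption: a field counts as "zero" when all of its octets are zero.
    """
    esi_set = any(esi)
    gw_set = any(gw_ip)
    if esi_set and gw_set:
        # More than one overlay index: handle the route as a withdraw.
        return "treat-as-withdraw"
    if esi_set:
        return "ESI"
    if gw_set:
        return "GW-IP"
    return "none"
```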


   The following table shows the different inter-subnet use-
   cases described in this document and the corresponding coding of the
   overlay index in the route type 5 (RT-5). The IP-VRF-to-IP-VRF or IRB
   forwarding on NVEs case is a special use-case, where there may be no
   need for overlay index, since the actual next-hop is given by the BGP
   next-hop. When an overlay index is present in the RT-5, the receiving
   NVE will need to perform a recursive route resolution to find out to
   which egress NVE to forward the packets.

**** I think we need a precise set of rules saying exactly what the "overlay
**** index" is, and how the originator of the update knows which kind of
**** overlay index to encode into the update.  How does an NVE know which
**** "use case" it is in?
[JORGE] please see the new section 3.2 and let us know if it addresses your comments.

**** Is the below list of use cases exhaustive?  If not, what do we do when
**** we encounter another use case?

[JORGE] RT-5 only allows GW IP, ESI, MAC or N/A as Overlay Indexes, where N/A means no recursive lookup and no overlay index.
The use-cases below should be representative enough since they cover all of the above cases. Any other use case “x” should follow the rules of the use case where the same Overlay Index is used. Added: “The above use-cases are representative of the different Overlay Indexes supported by RT-5 (GW IP, ESI, MAC or N/A). Any other use-case using a given Overlay Index, SHOULD follow the procedures described in this document for the same Overlay Index.”

   +----------------------------+--------------------------------------+
   | Use-case                   | Overlay Index in the RT-5 BGP update |
   +----------------------------+--------------------------------------+
   | TS IP address              | Overlay GW IP Address                |
   | Floating IP address        | Overlay GW IP Address                |
   | "Bump in the wire"         | ESI                                  |
   | IP-VRF-to-IP-VRF           | Overlay GW IP, MAC or N/A            |
   +----------------------------+--------------------------------------+


4. Benefits of using the EVPN IP Prefix route

**** This section is primarily repeating material from Section 2.2, though
**** with a bit more explanation.  Maybe this section should be moved
**** forward to replace 2.2 entirely.

[JORGE] OK, I removed this section. We can expand on section 2.2 if needed. 

<snip>


5. IP Prefix overlay index use-cases

   The IP Prefix route can use a GW IP or an ESI as an overlay index as
   well as no overlay index whatsoever. This section describes some use-
   cases for these index types.

**** I find it very difficult to isolate the salient differences between
**** these use cases, or to say exactly why each use case requires the
**** particular type of "overlay index" that is given in the example.

[JORGE] ok, I added a paragraph at the beginning of each use-case trying to explain why a given overlay index type is used. I hope it helps.

<snip>

5.4 IP-VRF-to-IP-VRF model

   This use-case is similar to the scenario described in "IRB forwarding
   on NVEs for Tenant Systems" in [EVPN-INTERSUBNET], however the new
   requirement here is the advertisement of IP Prefixes as opposed to
   only host routes. 

   In the examples described in sections 5.1, 5.2 and 5.3, the MAC-VRF
   instance can connect IRB interfaces and any other Tenant Systems
   connected to it. EVPN provides connectivity for:

   1. Traffic destined to the IRB IP interfaces as well as

   2. Traffic destined to IP subnets seating behind the TS, e.g. SN1 or
      SN2.

**** "seating" --> "sitting" (also below)  
[JORGE] ok, changed all the occurrences.    

   In order to provide connectivity for (1), MAC/IP routes (RT-2) are
   needed so that IRB MACs and IPs can be distributed. Connectivity type
   (2) is accomplished by the exchange of IP Prefix routes (RT-5) for
   IPs and subnets seating behind certain overlay indexes, e.g. GW IP or
   ESI.

   In some cases, IP Prefix routes may be advertised for subnets and IPs
   seating behind an IRB. We refer to this use-case as the "IP-VRF-to-
   IP-VRF" model.

**** As long as the IP-VRF is the EVPN type of IP-VRF, not the L3VPN type? 
[JORGE] added: “and EVPN is the only enabled SAFI in the network.”  

   [EVPN-INTERSUBNET] defines an asymmetric IRB model and a symmetric
   IRB model, based on the required lookups at the ingress and egress
   NVE: the asymmetric model requires an ip-lookup and a mac-lookup at
   the ingress NVE, whereas only a mac-lookup is needed at the egress
   NVE; the symmetric model requires ip and mac lookups at both, ingress
   and egress NVE. From that perspective, the IP-VRF-to-IP-VRF use-case
   described in this section is a symmetric IRB model. Note that in an
   IP-VRF-to-IP-VRF scenario, a PE may not be configured with any MAC-
   VRF for a given tenant, in which case it will only be doing IP
 

**** Perhaps: "a PE may not be configured with any MAC-VRF for a given
**** tenant" --> "a PE may have only an IP-VRF, but no MAC-VRF, for a given
**** tenant". 
[JORGE] done.
   
   lookups and forwarding for that tenant.

   Based on the way the IP-VRFs are interconnected, there are three
   different IP-VRF-to-IP-VRF scenarios identified and described in this
   document:

   1) Interface-less model
   2) Interface-full with core-facing IRB model
   3) Interface-full with unnumbered core-facing IRB model

**** I guess that would be "interface-ful".  But it's not apparent why a
**** certain case is called "interface-ful" and a certain case called
**** "interface-less". 
[JORGE] I don’t understand why “interface-ful” and not “full”? The name refers to whether or not an IRB interface needs to be defined when connecting IP-VRFs. The interface-less model is similar to IP-VPN, where the IP-VRFs are connected by tunnels. I thought the names that we agreed on helped convey the differences, but let us know otherwise.



5.4.1 Interface-less IP-VRF-to-IP-VRF model

   Figure 6 will be used for the description of this model. 


                         NVE1(M1)
                +------------+
        IP1+----|(MAC-VRF1)  |                DGW1(M3)
                |      \     |    +---------+ +--------+
                |    (IP-VRF)|----|         |-|(IP-VRF)|----+
                |      /     |    |         | +--------+    |
            +---|(MAC-VRF2)  |    |         |              _+_
            |   +------------+    |         |             (   )
         SN1|                     |  VXLAN/ |            ( WAN )
            |            NVE2(M2) |  nvGRE/ |             (___)
            |   +------------+    |  MPLS   |               +
            +---|(MAC-VRF2)  |    |         | DGW2(M4)      |
                |       \    |    |         | +--------+    |
                |    (IP-VRF)|----|         |-|(IP-VRF)|----+
                |       /    |    +---------+ +--------+
        SN2+----|(MAC-VRF3)  |
                +------------+




         Figure 6 Interface-less IP-VRF-to-IP-VRF model

   In this case, the requirements are the following:

   a) The NVEs and DGWs must provide connectivity between hosts in SN1,
      SN2, IP1 and hosts seating at the other end of the WAN.

   b) The IP-VRF instances in the NVE/DGWs are directly connected
      through NVO tunnels,

**** What does that mean, exactly?
[JORGE] It means that the IP-VRFs are connected directly by tunnels, with no IRB interfaces. Similar to IP-VPN.

      and no IRBs and/or MAC-VRF instances are
      defined at the core.

**** What is meant by "defined at the core"?      
[JORGE] I added “and no IRBs and/or MAC-VRF instances are instantiated to connect the IP-VRFs” – let us know if it reads better.

 



   c) The solution must provide layer-3 connectivity among the IP-VRFs
      for Ethernet NVO tunnels, for instance, VXLAN or nvGRE.

**** I'm having trouble parsing that sentence.  Does it mean that the
**** solution must cause IP datagrams to travel between two IP-VRFs through
**** tunnels that can only carry ethernet frames?
[JORGE] It means the solution must support Ethernet tunnels to connect two IP-VRFs. That is, IP packets can travel between two IP-VRFs but using Ethernet headers. VXLAN only supports the tunneling of Ethernet frames and not IP directly. Let us know if you would add any text clarifying this.


   d) The solution may provide layer-3 connectivity among the IP-VRFs
      for IP NVO tunnels, for example, VXLAN GPE (with IP payload).

**** I don't understand this, the solution as described in this document
**** does provide for the use of IP NVO tunnels, so what is meant by "may
**** provide"?
[JORGE] “may provide” because this is EVPN, so normally we assume tunnels with Ethernet in the payload, but in the interface-less model the IP-VRFs are directly connected by tunnels (no MAC-VRFs) so you “may” use tunnels with IP only in the payload (no need for Ethernet header). Let us know if you want to add any text clarifying this.


   In order to meet the above requirements, the EVPN route type 5 will
   be used to advertise the IP Prefixes, along with the Router's MAC
   Extended Community as defined in [EVPN-INTERSUBNET] if the
   advertising NVE/DGW uses Ethernet NVO tunnels. Each NVE/DGW will
   advertise an RT-5 for each of its prefixes with the following fields:

        o RD as per [RFC7432].

        o Eth-Tag ID=0 assuming VLAN-based service.

**** Why are we assuming VLAN-based service?        
[JORGE] ok, I removed the “assuming ...” bit since there is no need to add eth-tag in this case anyway.

        o IP address length and IP address, as explained in the previous
          sections.

        o GW IP address= SHOULD be set to 0.

**** What are the conditions under which it is okay to set it to non-zero?     
[JORGE] I don’t remember why we wrote SHOULD. I’ll change it to “GW IP address=0” if no one has any opinion against it.   


        o ESI=0

        o MPLS label or VNI corresponding to the IP-VRF.

   Each RT-5 will be sent with a route-target identifying the tenant
   (IP-VRF)

**** Presumably the RT identifies the set of IP-VRFs into which the update
**** may be imported.  Is there any relationship between this RT and other
**** RTs used by EVPN?
[JORGE] In this model the IP-VRFs only exchange RT-5s, so there is no relationship with the route-targets used for other route types, if that’s what you’re asking.

   and two BGP extended communities:

        o The first one is the BGP Encapsulation Extended Community, as
          per [RFC5512], identifying the tunnel type.

        o The second one is the Router's MAC Extended Community as per
          [EVPN-INTERSUBNET] containing the MAC address associated to
          the NVE advertising the route. This MAC address identifies the
          NVE/DGW and MAY be re-used for all the IP-VRFs in the NVE. The
          Router's MAC Extended Community MUST be sent if the route is
          associated to an Ethernet NVO tunnel, for instance, VXLAN. If
          the route is associated to an IP NVO tunnel, for instance
          VXLAN GPE with IP payload, the Router's MAC Extended Community
          SHOULD NOT be sent.

   The following example illustrates the procedure to advertise and
   forward packets to SN1/24 (ipv4 prefix advertised from NVE1) for
   VXLAN tunnels:

**** Leaving it as an exercise for the reader to figure out what to do in
**** the case of other tunnel types?   
[JORGE] good point. I changed the text to generalize it for any tunnel type.

   (1) NVE1 advertises the following BGP route:

        o Route type 5 (IP Prefix route) containing:

          . IPL=24, IP=SN1, VNI=10.

          . GW IP= SHOULD be set to 0.

          . [RFC5512] BGP Encapsulation Extended Community with Tunnel-
            type=VXLAN.

          . Router's MAC Extended Community that contains M1.

          . Route-target identifying the tenant (IP-VRF).

   (2) DGW1 imports the received routes from NVE1:

        o DGW1 installs SN1/24 in the IP-VRF identified by the RT-5
          route-target.

        o Since GW IP=0 and the VNI is a valid value, DGW1 will use the
          VNI and next-hop of the RT-5, as well as the MAC address
          conveyed in the Router's MAC Extended Community (as inner
          destination MAC address) to encapsulate the routed IP packets.

   (3) When DGW1 receives a packet from the WAN with destination IPx,
       where IPx belongs to SN1/24:

        o A destination IP lookup is performed on the DGW1 IP-VRF
          routing table. The lookup yields SN1/24.

        o Since the RT-5 for SN1/24 had a GW IP=0 and a valid VNI and
          next-hop (used as destination VTEP), DGW1 will not need a
          recursive lookup to resolve the route.

        o The IP packet destined to IPx is encapsulated with: Source
          inner MAC = DGW1 MAC, Destination inner MAC = M1, Source outer
          IP (source VTEP) = DGW1 IP, Destination outer IP (destination
          VTEP) = NVE1 IP.

   (4) When the packet arrives at NVE1:

        o NVE1 will identify the IP-VRF for an IP-lookup based on the
          VNI.

        o An IP lookup is performed in the routing context, where SN1
          turns out to be a local subnet associated to MAC-VRF2. A
          subsequent lookup in the ARP table and the MAC-VRF FIB will
          provide the forwarding information for the packet in MAC-VRF2.
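
Steps (2) and (3) above can be sketched in Python as follows. The addresses and MAC values are placeholders (the draft only names SN1, M1, VNI 10, NVE1 and DGW1), and the table layout and function names are assumptions made for the example.

```python
import ipaddress

# Placeholder values standing in for SN1, M1 and NVE1's VTEP address.
SN1 = ipaddress.ip_network("10.1.1.0/24")
M1 = "00:00:5e:00:53:01"
NVE1_VTEP = "192.0.2.1"

# DGW1's IP-VRF after importing the RT-5 from NVE1 (step 2).
IP_VRF = {
    SN1: {"gw_ip": "0.0.0.0", "vni": 10,
          "next_hop": NVE1_VTEP, "router_mac": M1},
}


def lookup(ip_vrf, dst):
    """Longest-prefix match of dst in the IP-VRF; returns the entry or None."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in ip_vrf if addr in net]
    if not matches:
        return None
    return ip_vrf[max(matches, key=lambda net: net.prefixlen)]


def encapsulate(ip_vrf, dst, local_mac, local_vtep):
    """Build the VXLAN header fields for a routed packet (step 3)."""
    entry = lookup(ip_vrf, dst)
    if entry is None:
        return None
    if entry["gw_ip"] != "0.0.0.0":
        # An overlay index is present, so this model does not apply.
        raise NotImplementedError("recursive lookup needed")
    # GW IP=0 and a valid VNI: no recursive lookup is required.
    return {
        "inner_src_mac": local_mac,
        "inner_dst_mac": entry["router_mac"],
        "outer_src_ip": local_vtep,
        "outer_dst_ip": entry["next_hop"],
        "vni": entry["vni"],
    }
```

Everything needed to forward comes from the RT-5 itself and its Router's MAC Extended Community, which is why no indirection is involved in this model.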

   The implementation of this Interface-less model is REQUIRED.

**** I don't think this makes it clear just exactly what is REQUIRED.

**** It's certainly not clear why this is called an "interface-less" model.
[JORGE] I added: “The implementation described above is called Interface-less model since the IP-VRFs are connected directly through tunnels and they don't require those tunnels to be terminated in MAC-VRFs instead, like in sections 4.4.2 or 4.4.3. An EVPN IP-VRF-to-IP-VRF implementation is REQUIRED to support the ingress and egress procedures described in this section.”
Let me know if it helps.


 



5.4.2 Interface-full IP-VRF-to-IP-VRF with core-facing IRB 

   Figure 7 will be used for the description of this model.


                       NVE1
              +------------+                       DGW1
      IP1+----+(MAC-VRF1)  | +---------------+ +------------+
              |  \      (core)              (core)          |
              |(IP-VRF)(MAC-VRF)           (MAC-VRF)(IP-VRF)|-----+
              |  /    IRB(IP1/M1)         IRB(IP3/M3)       |     |
          +---+(MAC-VRF2)  | |               | +------------+    _+_
          |   +------------+ |               |                  (   )
       SN1|                  |     VXLAN/    |                 ( WAN )
          |            NVE2  |     nvGRE/    |                  (___)
          |   +------------+ |     MPLS      |     DGW2           +
          +---+(MAC-VRF2)  | |               | +------------+     |
              |  \      (core)              (core)          |     |
              |(IP-VRF)(MAC-VRF)           (MAC-VRF)(IP-VRF)|-----+
              |  /   IRB(IP2/M2)          IRB(IP4/M4)       |
      SN2+----+(MAC-VRF3)  | +---------------+ +------------+
              +------------+


         Figure 7 Interface-full with core-facing IRB model

   In this model, the requirements are the following: 

   a) As in section 5.4.1, the NVEs and DGWs must provide connectivity
      between hosts in SN1, SN2, IP1 and hosts seating at the other end
      of the WAN.

   b) However, the NVE/DGWs are now connected through Ethernet NVO
      tunnels terminated in core-MAC-VRF instances. The IP-VRFs use IRB
      interfaces for their connectivity to the core MAC-VRFs.

**** What exactly is a "core-MAC-VRF instance" or a "core-facing IRB"?
[JORGE] The NVEs and DGWs are connected by tunnels that must be terminated in MAC-VRFs instead of the IP-VRFs directly. Those MAC-VRFs are the “core-MAC-VRFs” and they are linked to the IP-VRFs via “core-IRB” interfaces as per the Figure.

   c) Each core-facing IRB has an IP and a MAC address, where the IP
      address must be reachable from other NVEs or DGWs. 

   d) The core EVI is composed of the NVE/DGW MAC-VRFs and may contain
      other MAC-VRFs without IRB interfaces. Those non-IRB MAC-VRFs will
      typically connect TSes that need layer-3 connectivity to remote
      subnets.

   e) The solution must provide layer-3 connectivity for Ethernet NVO
      tunnels, for instance, VXLAN or nvGRE.

   EVPN type 5 routes will be used to advertise the IP Prefixes, whereas
   EVPN RT-2 routes will advertise the MAC/IP addresses of each core-
   facing IRB interface. Each NVE/DGW will advertise an RT-5 for each of
   its prefixes with the following fields:

        o RD as per [RFC7432].

        o Eth-Tag ID=0 assuming VLAN-based service.

        o IP address length and IP address, as explained in the previous
          sections.

        o GW IP address=IRB-IP (this is the overlay index that will be
          used for the recursive route resolution).

        o ESI=0

        o MPLS label or VNI corresponding to the IP-VRF. Note that the
          value SHOULD be zero

**** and if it isn't?
[JORGE] Added: “The RT-5's Label field will be ignored on reception”

          since the RT-5 route requires a recursive
          lookup resolution to an RT-2 route. The MPLS label or VNI to
          be used when forwarding packets will be derived from the RT-
          2's MPLS Label1 field.

   Each RT-5 will be sent with a route-target identifying the tenant
   (IP-VRF). The Router's MAC Extended Community SHOULD NOT be sent in
   this case.

   The following example illustrates the procedure to advertise and
   forward packets to SN1/24 (ipv4 prefix advertised from NVE1) for
   VXLAN tunnels:

   (1) NVE1 advertises the following BGP routes:

        o Route type 5 (IP Prefix route) containing:

          . IPL=24, IP=SN1, VNI= SHOULD be set to 0.

          . GW IP=IP1 (core-facing IRB's IP)

          . Route-target identifying the tenant (IP-VRF).

        o Route type 2 (MAC/IP route for the core-facing IRB)
          containing:

          . ML=48, M=M1, IPL=32, IP=IP1, VNI=10.

          . A [RFC5512] BGP Encapsulation Extended Community with
            Tunnel-type= VXLAN.

 



          . Route-target identifying the tenant. This route-target MAY
            be the same as the one used with the RT-5.

   (2) DGW1 imports the received routes from NVE1:

        o DGW1 installs SN1/24 in the IP-VRF identified by the RT-5
          route-target.

          . Since GW IP is different from zero, the GW IP (IP1) will be
            used as the overlay index for the recursive route resolution
            to the RT-2 carrying IP1.

   (3) When DGW1 receives a packet from the WAN with destination IPx,
       where IPx belongs to SN1/24:

        o A destination IP lookup is performed on the DGW1 IP-VRF
          routing table. The lookup yields SN1/24, which is associated
          to the overlay index IP1. The forwarding information is
          derived from the RT-2 received for IP1.

        o The IP packet destined to IPx is encapsulated with: Source
          inner MAC = M3, Destination inner MAC = M1, Source outer IP
          (source VTEP) = DGW1 IP, Destination outer IP (destination
          VTEP) = NVE1 IP.

   (4) When the packet arrives at NVE1:

        o NVE1 will identify the IP-VRF for an IP-lookup based on the
          VNI and the inner MAC DA.

        o An IP lookup is performed in the routing context, where SN1
          turns out to be a local subnet associated to MAC-VRF2. A
          subsequent lookup in the ARP table and the MAC-VRF FIB will
          provide the forwarding information for the packet in MAC-VRF2.
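
The recursive resolution in steps (2) and (3) can be sketched in Python as below. The concrete addresses are placeholders for SN1, IP1, M1 and NVE1, and the two dictionaries are a simplified stand-in for the routes DGW1 has imported.

```python
import ipaddress

# Placeholder values standing in for SN1, IP1, M1 and NVE1's VTEP address.
SN1 = ipaddress.ip_network("10.1.1.0/24")
IP1 = "192.0.2.101"          # core-facing IRB IP of NVE1
M1 = "00:00:5e:00:53:01"     # core-facing IRB MAC of NVE1
NVE1_VTEP = "192.0.2.1"

# RT-5: prefix -> overlay index (GW IP); the RT-5 label is not used.
rt5_table = {SN1: {"gw_ip": IP1, "label": 0}}
# RT-2: IRB IP -> MAC, VNI and BGP next-hop.
rt2_table = {IP1: {"mac": M1, "vni": 10, "next_hop": NVE1_VTEP}}


def resolve(rt5_table, rt2_table, dst):
    """Resolve dst via the RT-5, then recursively via the RT-2 for its GW IP."""
    addr = ipaddress.ip_address(dst)
    matches = [net for net in rt5_table if addr in net]
    if not matches:
        return None
    rt5 = rt5_table[max(matches, key=lambda net: net.prefixlen)]
    rt2 = rt2_table.get(rt5["gw_ip"])
    if rt2 is None:
        # Overlay index not resolved: the prefix cannot be used.
        return None
    # Forwarding information is derived from the RT-2, not the RT-5.
    return {
        "inner_dst_mac": rt2["mac"],
        "vni": rt2["vni"],
        "outer_dst_ip": rt2["next_hop"],
    }
```

Removing the single RT-2 entry for IP1 leaves every prefix behind that GW IP unresolved, which is the indirection benefit discussed further down in this thread.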

   The implementation of the Interface-full with core-facing IRB model
   is REQUIRED.

**** Again, I don't think I could say from this precisely which features are
**** required.

[JORGE] Added:
“The model described above is called Interface-full with core-facing IRB model since the tunnels connecting the DGWs and NVEs need to be terminated into core MAC-VRFs. Those MAC-VRFs are connected to the IP-VRFs via core-facing IRB interfaces. An EVPN IP-VRF-to-IP-VRF implementation is REQUIRED to support the ingress and egress procedures described in this section.”

**** I'm not sure I could even say what the salient differences are between
**** this use case and the previous one.  In both cases, one ends up
**** tunneling packets to NVE1.  Why exactly does one case require
**** indirection (i.e., recursive resolution of an overlay index) and the
**** other not?
[JORGE] Both implementations exist in the market today. Both have pros and cons. Interface-less has a simpler control-plane, but no indirection. The interface-full model is the opposite. The indirection is good since you propagate the state of the GW IP with a single update/withdraw. Also, note that the DGW part of the interface-full model is exactly the same as for the other use-cases that use a GW IP Overlay Index, hence a DGW implementation may decide to do this and cover a handful of use-cases in this draft.




<snip>



6. Conclusions

   An EVPN route (type 5) for the advertisement of IP Prefixes is
   described in this document. This new route type has a differentiated
   role from the RT-2 route and addresses all the Data Center (or NVO-
   based networks in general) inter-subnet connectivity scenarios in
   which an IP Prefix advertisement is required.

**** How do we know it addresses all the requirements for all use cases?
[JORGE] Changed to:
“This new route type has a differentiated role from the RT-2 route and addresses the Data Center (or NVO-based networks in general) inter-subnet connectivity scenarios described in this document”

   Using this new RT-5, an
   IP Prefix may be advertised along with an overlay index that can be a
   GW IP address, a MAC or an ESI, or without an overlay index, in which
   case the BGP next-hop will point at the egress NVE

**** The BGP next hop is not necessarily the egress NVE.
[JORGE] changed to: “the BGP next-hop will point at the egress NVE/ASBR/ABR”

   and the MAC in the
   Router's MAC Extended Community will provide the inner MAC
   destination address to be used. As discussed throughout the document,
   the EVPN RT-2 does not meet the requirements for all the DC use
   cases, therefore this EVPN route type 5 is required.

   The EVPN route type 5 decouples the IP Prefix advertisements from the
   MAC/IP route advertisements in EVPN, hence:

**** This is the third time this information is being repeated.  
[JORGE] OK, I summarized these points. This is supposed to be just a summary. 
 

<snip>

9. IANA Considerations

   This document requests the allocation of value 5 in the "EVPN Route
   Types" registry defined by [RFC7432] and modification of the registry
   as follows:

   Value     Description         Reference
   5         IP Prefix route     [this document]
   6-255     Unassigned

**** Since this document is not creating the registry, I believe the proper
**** procedure is to request the codepoint you want, but not to assign values
**** for other codepoints.

[JORGE] ok, I removed the unassigned range.