Re: [bess] WG Last Call for draft-ietf-bess-evpn-inter-subnet-forwarding-03

"Jeffrey (Zhaohui) Zhang" <zzhang@juniper.net> Tue, 21 February 2017 19:27 UTC

From: "Jeffrey (Zhaohui) Zhang" <zzhang@juniper.net>
To: "Ali Sajassi (sajassi)" <sajassi@cisco.com>, Martin Vigoureux <martin.vigoureux@nokia.com>, BESS <bess@ietf.org>
Date: Tue, 21 Feb 2017 19:26:56 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/bess/VhaXgfb0-XcdvDQr9esISk8l0eE>
Subject: Re: [bess] WG Last Call for draft-ietf-bess-evpn-inter-subnet-forwarding-03

Hi Ali,

Please see below.

> -----Original Message-----
> From: Ali Sajassi (sajassi) [mailto:sajassi@cisco.com]
> Sent: Tuesday, February 21, 2017 2:12 AM
> To: Jeffrey (Zhaohui) Zhang <zzhang@juniper.net>; Martin Vigoureux
> <martin.vigoureux@nokia.com>; BESS <bess@ietf.org>
> Subject: Re: [bess] WG Last Call for draft-ietf-bess-evpn-inter-subnet-
> forwarding-03
> 
> 
> Hi Jeffrey,
> 
> My comments in line ...
> 
> On 2/17/17, 9:09 PM, "Jeffrey (Zhaohui) Zhang" <zzhang@juniper.net> wrote:
> 
> >Hi Ali,
> >
> >> >Is it that besides their own default gateway, an additional common
> >> >gateway is advertised using that extended community? If so, what's the
> >> >purpose to call those different IP addresses on the IRB interfaces
> >> >"default gateway"? I assume the hosts will be using the common gateway
> >> >address?
> >>
> >> This comment should be already addressed by above clarification.
> >>
> >> >
> >> >   It is worth noting that if the applications that are running on the
> >> >   TS's are employing or relying on any form of MAC security, then the
> >> >   first model (i.e. using anycast addresses) would be required to
> >> >   ensure that the applications receive traffic from the same source
> >>MAC
> >> >   address that they are sending to.
> >> >
> >> >Why is that? As long as an NVE changes the source MAC to the one it
> >>sends
> >> >in the ARP reply, it should work even with the 2nd model?
> >>
> >> If it changes, yes, but if it doesn't change it, it creates an issue for MAC
> >> security. Modified the paragraph to provide further clarification:
> >>
> >> "It is worth noting that if the applications that are running on the
> >>TS's
> >> are employing or relying on any form of MAC security, then either the
> >> first model (i.e. using anycast MAC address) should be used to ensure
> >>that
> >> the applications receive traffic from the same IRB interface MAC address
> >> that they are sending to, or if the second model is used, then the IRB
> >> interface MAC address MUST be the one used in the initial ARP reply for
> >> that TS."
> >
> >I thought it was a given that the mac address in the ARP reply is the
> >same as the mac address that the device uses when it routes a packet out
> >of the IRB interface?
> 
> No, it is not that simple if there is no ARP request upon a MAC move! VM1
> with (MAC1, IP1) first appears behind PE1 with IRB-x (MAC-x, IP-x).
> Then it moves to PE2 with IRB-y (MAC-y, IP-x). The VM1 move may not
> trigger an ARP request. So, the packets from VM1 with MAC-DA of MAC-x
> need to be aliased to IRB-y. In the reverse direction, the packets from PE2
> to VM1 will typically have MAC-SA as MAC-y (IRB interface of PE2);
> however, if VM1 expects to receive packets with MAC-SA as MAC-x (for a MAC
> security application), then PE2 needs to use the MAC address of IRB-x (on
> PE1), which is the MAC address in the "initial ARP" reply sent by PE1.

When PE1 sent the initial ARP reply, it would not go to PE2, right? How does PE2 know to use the MAC address of IRB-x after the MAC move?
Does that mean that the 2nd option can't be used after all? That's fine as long as it is clearly explained.
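
To make the MAC-move scenario above concrete, here is a minimal sketch in Python. All names (Irb, Host, source_mac_on_routed_packet) are hypothetical and only model this discussion; nothing here comes from the draft. It assumes VM1 caches the gateway MAC from the initial ARP reply and does not re-ARP after the move, and it leaves open how PE2 would ever learn MAC-x, which is exactly the question.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Irb:
    """An IRB interface on a PE (hypothetical model, not from the draft)."""
    pe: str
    mac: str
    ip: str


@dataclass
class Host:
    """A tenant VM that caches the gateway MAC it learned via ARP."""
    mac: str
    ip: str
    cached_gw_mac: Optional[str] = None

    def arp_for_gateway(self, irb: Irb) -> None:
        # The ARP reply carries the MAC of the IRB that answered.
        self.cached_gw_mac = irb.mac


def source_mac_on_routed_packet(local_irb: Irb,
                                initial_arp_mac: Optional[str]) -> str:
    """MAC-SA a PE puts on packets it routes towards the host.

    Assumption: if the PE somehow knows the MAC used in the host's initial
    ARP reply, it reuses it; otherwise it falls back to its own IRB MAC.
    """
    return initial_arp_mac if initial_arp_mac is not None else local_irb.mac


irb_x = Irb(pe="PE1", mac="MAC-x", ip="IP-x")
irb_y = Irb(pe="PE2", mac="MAC-y", ip="IP-x")  # same gateway IP, different MAC

vm1 = Host(mac="MAC1", ip="IP1")
vm1.arp_for_gateway(irb_x)  # VM1 first appears behind PE1

# VM1 moves to PE2 without sending a new ARP request.
default_sa = source_mac_on_routed_packet(irb_y, None)     # PE2 uses its own MAC
aliased_sa = source_mac_on_routed_packet(irb_y, "MAC-x")  # PE2 was told MAC-x

mismatch_default = default_sa != vm1.cached_gw_mac
mismatch_aliased = aliased_sa != vm1.cached_gw_mac
```

In this model the mismatch only goes away when PE2 is handed MAC-x from somewhere, so the text would need to say where that information comes from.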

> 
> >
> >> >
> >> >For section 4.2, the processing on the ingress and egress NVE is no
> >> >different from 4.1; the processing on the ASBRs is vanilla EVPN
> >> >forwarding and not specific to inter-subnet forwarding at all;
> >>therefore,
> >> >4.2 is not really needed? Additionally, the section is about "w/o GW",
> >> >yet the text talks about ingress/egress Gateway. It's better to replace
> >> >Gateway with ASBRs.
> >>
> >> Changed "GW" in the diagram to "ASBR".
> >
> >For conciseness, it may be better to simply remove 4.2 because of the
> >reasons mentioned above.
> 
> It is a different scenario than 4.1 and thus deserves a separate
> explanation. I may reference 4.1 and remove the duplicate text.
> 
> >
> >> >
> >> >   ... This implies that the GW1 needs to keep the remote
> >> >   host MAC addresses along with the corresponding EVPN labels in the
> >> >   adjacency entries of the IP-VRF table (i.e., its ARP table).
> >> >
> >> >Does that mean GW1 needs to keep all type-2 IP/MAC addresses that GW2
> >> >learns from NVEs in DC2?
> >>
> >> Yes it does.
> >>
> >> >Also does that mean GW1 and GW2 must be attached to all subnets?
> >>
> >> That's correct.
> >>
> >> > If so, between the source NVE and its local GW, when the source NVE
> >> >routes the traffic to the GW, I assume it's the destination subnet's IRB
> >> >interface's mac address that will be used as the source mac address,
> >>
> >> Correct. It is NVE1's destination subnet's IRB interface MAC address.
> >>
> >> >and GW1's IRB mac address for the destination subnet is used as the dst
> >> >mac address. It is true that an NVE may have the same system mac
> >>address
> >> >for all its IRB interfaces, but it's good to point out which IRB is
> >>used
> >> >(and if different IRB mac address is used, things will still work out).
> >>
> >> I updated section 4.3. Please take a look at it (refer to the
> >> attachment) and let me know if you have any further comments.
> >
> >I just realized that the spec only mentions a default route in the IP-VRF
> >but it does not mention how it is advertised. Is it advertised as IP-VPN
> >(SAFI 128) route or the unknown mac route in EVPN? If IP-VPN, there is
> >only one such route and it is not tied to the destination subnet; if it
> >is the unknown mac route, I assume it is one per subnet, but eventually
> >only one is selected as active, and it may not be tied to the destination
> >subnet? On the other hand, between the source NVE and its local GW, it
> >does not really matter which subnet is used. The text should be clear
> >though: a) how is the default route advertised; b) how is the traffic
> >forwarded between the source NVE and its local GW - looks like only IP
> >payload is needed (similar to the symmetric case)?
> 
> We need to look at this scenario with respect to both NVE and GW:
> From the NVE perspective, inter-subnet traffic within its own DC is
> handled based on section 4.1. Inter-DC inter-subnet traffic is
> handled similarly to the symmetric IRB case.
> From the GW perspective, inter-subnet traffic between the GWs is handled
> based on the asymmetric IRB of 4.1.
> I will add some additional explanation.

Yes, this is an important clarification.
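
As a rough illustration of the NVE-side split described above, here is a small Python sketch. It assumes, hypothetically, that the NVE's IP-VRF holds host entries only for hosts in its own DC plus a default route pointing at its local GW; the addresses, next-hop strings, and the lookup helper are all made up for this example.

```python
import ipaddress
from typing import Dict, Optional


def lookup(ip_vrf: Dict[str, str], dst: str) -> Optional[str]:
    """Longest-prefix match over a dict of prefix -> next hop."""
    addr = ipaddress.ip_address(dst)
    best = None
    for prefix, next_hop in ip_vrf.items():
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop)
    return best[1] if best else None


# Hypothetical IP-VRF on NVE1: host entries for its own DC, default to GW1.
nve1_ip_vrf = {
    "10.1.1.5/32": "host adjacency in own DC (as in section 4.1)",
    "10.1.2.7/32": "host adjacency in own DC (as in section 4.1)",
    "0.0.0.0/0": "GW1 (default route, handled like symmetric IRB)",
}

intra_dc = lookup(nve1_ip_vrf, "10.1.2.7")  # matches the /32 host entry
inter_dc = lookup(nve1_ip_vrf, "10.2.1.9")  # falls through to the default
```

With only the default route covering other DCs, anything not in NVE1's own DC is routed to GW1, which then performs its own IP lookup.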

> 
> >
> >I looked at the updated 4.3 but the text is still confusing.
> >
> >BTW - if an NVE/GW advertises an unknown mac route, when we install an IP
> >route to the IP VRF, should we install subnet's prefix route instead of
> >the default route?
> 
> The GW should advertise a default route to the NVE, i.e., in this scenario we
> want the NVEs to maintain routes only for their own DC and not for any other DCs.
>
> >BTW - I could not figure out in which draft the unknown mac route is
> >defined. The virtual hub and spoke draft says:
> >
> >   [I-D.EVPN-overlay] defines the notion of the unknown MAC route for an
> >   EVI which is analogous to a VPN-IP default route for a VPN.
> >
> >But the EVPN overlay draft does not have the details. It only mentions
> >the unknown mac route.
> 
> I’ll add the reference for unknown MAC route.

Please include details such as how the default route is advertised. It is not clear whether it is advertised via the unknown MAC route. If so, in theory, should the unknown MAC route encode the BD's subnet address in the IP part?

Thanks.
Jeffrey

> 
> >
> >>
> >> >
> >> >Also, while each subnet is attached to each NVE, the IP routing process
> >> >(e.g. TTL decrement) happens twice - first on the source NVE and then
> >>on
> >> >the source GW. That's probably OK, but better point out all these
> >>details.
> >>
> >> TTL decrement is given when IP lookup is performed at each hop.
> >
> >True; my point is that when a router is connected to every subnet,
> >routing would only happen once. In this case, routing happens twice even
> >though every subnet is connected to every NVE.
> 
> It happens twice (once on the ingress NVE and once on the ingress GW).
> The reason it happens twice is that the ingress NVE has only a default route
> to the ingress GW for all routes from other DCs.
> 
> Cheers,
> Ali
> 
> >
> >Jeffrey
> >