Re: [bess] comment on draft-ietf-bess-ir

Eric C Rosen <erosen@juniper.net> Wed, 23 September 2015 20:26 UTC

To: Lucy yong <lucy.yong@huawei.com>, "Jeffrey (Zhaohui) Zhang" <zzhang@juniper.net>, "draft-ietf-bess-ir@ietf.org" <draft-ietf-bess-ir@ietf.org>
References: <2691CE0099834E4A9C5044EEC662BB9D571D7792@dfweml701-chm> <BLUPR0501MB1715C4449A20759976E757F5D46A0@BLUPR0501MB1715.namprd05.prod.outlook.com> <2691CE0099834E4A9C5044EEC662BB9D571D78FA@dfweml701-chm> <55E60759.3090502@juniper.net> <2691CE0099834E4A9C5044EEC662BB9D571D85E5@dfweml701-chm> <560007CF.6010206@juniper.net> <2691CE0099834E4A9C5044EEC662BB9D571F4F1E@dfweml701-chm> <56018DB6.2000202@juniper.net> <2691CE0099834E4A9C5044EEC662BB9D571F5CDD@dfweml701-chm>
From: Eric C Rosen <erosen@juniper.net>
Message-ID: <56030ADD.1080809@juniper.net>
Date: Wed, 23 Sep 2015 16:26:05 -0400
In-Reply-To: <2691CE0099834E4A9C5044EEC662BB9D571F5CDD@dfweml701-chm>
Archived-At: <http://mailarchive.ietf.org/arch/msg/bess/XNC7PxTaZoEfzGfis19hjl0pS4Q>
Cc: "bess@ietf.org" <bess@ietf.org>
Subject: Re: [bess] comment on draft-ietf-bess-ir

Lucy,

1. Constraining the distribution of Leaf A-D routes.

If you look at sections 9.2.3.2.1 and 9.2.3.4.1 of RFC 6514, you'll see 
that there are some rules that enable you to avoid sending a Leaf A-D 
route on an EBGP session unless a corresponding I/S-PMSI A-D route was 
received over that session.  There are similar rules in RFC 6514 
governing the distribution of C-multicast routes.  These rules are 
intended to prevent the Leaf A-D routes and C-multicast routes from 
being distributed more widely than necessary.  Whether these rules 
always work is questionable; they tend to have hidden assumptions about 
the deployment.

But if you want to investigate ways to optimize the distribution of the 
Leaf A-D routes, that's a good place to start.

One might try the following rule.  If R1 receives a Leaf A-D route, and 
if R1 is not identified in the route's RT, and if the Leaf A-D route has 
a route key that is the NLRI of an S-PMSI A-D route that R1 has 
installed, then only distribute the Leaf A-D route on a BGP session that 
leads to the BGP speaker that is the next hop of the S-PMSI A-D route. 
Whether this rule actually works in various deployment scenarios would 
require further investigation.
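
As a rough illustration only, the rule above might be sketched as follows. 
All names here (SPmsiRoute, LeafRoute, sessions_for_leaf, the 
session_toward lookup table) are hypothetical and come from neither RFC 
6514 nor the draft; a return value of None means "the rule does not 
apply, fall back to ordinary BGP distribution".

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, Optional, Set


@dataclass(frozen=True)
class SPmsiRoute:
    nlri: str       # NLRI of an installed S-PMSI A-D route
    next_hop: str   # BGP next hop of that route


@dataclass(frozen=True)
class LeafRoute:
    route_key: str              # NLRI of the S-PMSI A-D route it answers
    rt_targets: FrozenSet[str]  # routers identified by the route's RT(s)


def sessions_for_leaf(
    r1: str,
    leaf: LeafRoute,
    installed_spmsi: Dict[str, SPmsiRoute],
    session_toward: Dict[str, str],
) -> Optional[Set[str]]:
    """Return the sessions the rule selects, or None if it does not apply."""
    if r1 in leaf.rt_targets:
        return None  # R1 is identified in the RT; the rule says nothing here
    spmsi = installed_spmsi.get(leaf.route_key)
    if spmsi is None:
        return None  # R1 has no matching S-PMSI A-D route installed
    # Assumption: a table maps a BGP next hop to the session leading to it.
    session = session_toward.get(spmsi.next_hop)
    return {session} if session is not None else None
```

The sketch deliberately leaves open the hard part, namely whether 
"the session that leads to the next hop" is well defined in a given 
deployment.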

[Lucy] To suppress unnecessary redistribution, a P-tunnel BGP node 
tracks P-tunnel neighbor state. A BGP next hop is one of: P-tunnel 
downstream neighbor, upstream neighbor, or N/A. The policy is: if the 
BGP next hop of the UPDATE of the Leaf A-D route is the downstream 
neighbor, redistribute the route; if not, do not redistribute it.

I don't understand this proposal; I don't see how you can tell by 
examining the next hop of the Leaf A-D route whether you need to 
redistribute the route.  A rule based on the next hop of the 
corresponding I/S-PMSI A-D route sounds more promising.

Another approach would be to use Constrained Route Distribution.  This 
would ensure that the Leaf A-D route reaches its target, and would 
prevent the route from traveling over "unnecessary" alternate paths.  In 
certain deployment scenarios, ORF is also available as a way to prevent 
routes from being distributed unnecessarily.  Both these methods are 
forms of RT-based filtering, and both are independent of MVPN.
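
Stripped of all protocol detail, both methods reduce to the same check 
on the sending side.  A minimal sketch, assuming a hypothetical 
peer_interest set that holds the RTs a peer has asked for (how that set 
is learned, via RT membership advertisements or an ORF, is out of scope 
here):

```python
from typing import Iterable, Set


def should_advertise(route_rts: Iterable[str], peer_interest: Set[str]) -> bool:
    """Send the route to a peer only if the peer wants at least one of its RTs."""
    return not set(route_rts).isdisjoint(peer_interest)


# Hypothetical example: a Leaf A-D route whose RT identifies parent P.
leaf_rts = ["rt:parent-P"]
toward_parent = {"rt:parent-P", "rt:vpn-blue"}   # peer on the path to P
elsewhere = {"rt:vpn-blue"}                      # peer with no interest in P

send_toward_parent = should_advertise(leaf_rts, toward_parent)
send_elsewhere = should_advertise(leaf_rts, elsewhere)
```

Note that nothing in the check mentions MVPN; it sees only RTs.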

Of course, one also has to worry about creating a robustness problem if 
route distribution is constrained so that routes follow only one path.

Since the topic of this thread is "comment on draft-ietf-bess-ir", and 
since that draft is in WGLC, I'll just point out again that this issue 
is not specific to ingress replication.

[Lucy] IMO: this mechanism for membership announcement raises a BIG 
concern on the scalability and performance. Why is it not a concern for you?

I wouldn't say it's not a concern, but it's important not to focus 
exclusively on the worst case.  Typical deployment scenarios don't come 
close to the worst case, and there are various tools and filtering 
policies that can be used to constrain the distribution of updates based 
on the RTs.

2. Changing your parent on an IR tree

I think we have a disconnect here, having to do with the layering 
between the MVPN application and BGP.

MVPN can create a route and give it to BGP.  MVPN can set and modify 
attributes of the route.  MVPN can withdraw the route.  But the 
distribution of the route is controlled by BGP.

MVPN cannot tell BGP "send an update for NLRI X with attribute A1 on BGP 
session S1, but send an update for NLRI X with attribute A2 on BGP 
session S2".  MVPN cannot tell BGP "send an update for NLRI X on session 
S1 but send a withdraw for NLRI X on session S2."  And MVPN cannot 
control the timing of BGP's route distribution procedures.

In short, MVPN does not create and send the update messages.

[Lucy] To change the parent, a child sends out the UPDATE of Leaf A-D 
route with new parent address in RT.

MVPN can tell BGP to change the RT and the PMSI Tunnel attribute on a 
given Leaf A-D route.  Suppose MVPN replaces the RT so that the RT now 
identifies the new parent rather than the old one.

If Constrained Route Distribution is being used, this will cause an 
explicit withdraw to be sent to the old parent.  There is no way for the 
MVPN process in the child node to control the timing of this BGP message.

If Constrained Route Distribution is not being used, changing the RT 
will cause BGP to send a new update to the old parent as well as to the 
new parent.  The old parent will treat this as a replacement route, and 
will consider the old route to have been (implicitly) withdrawn.  This 
behavior is mandated by section 3.1 of RFC 4271.  Since the old parent is 
not identified in the RT, the action it must take is the action 
specified for the withdrawal of a Leaf A-D route.
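
To make the two cases concrete, here is a toy model (hypothetical 
function names, and a simplifying assumption that the child's BGP 
updates reach both parents) of what each parent ends up doing when the 
RT is replaced:

```python
from typing import Dict, Set


def bgp_messages_on_rt_replace(
    old_parent: str, new_parent: str, constrained: bool
) -> Dict[str, str]:
    """Message each parent ends up receiving once the RT is replaced."""
    if constrained:
        # The old parent no longer matches the RT, so BGP withdraws explicitly.
        return {old_parent: "withdraw", new_parent: "update"}
    # Without constrained distribution the update goes to both.
    return {old_parent: "update", new_parent: "update"}


def parent_reaction(parent: str, message: str, rt_targets: Set[str]) -> str:
    """What a parent does with the message for this Leaf A-D route."""
    if message == "withdraw":
        return "remove child"
    # An update for the same NLRI replaces the earlier route (implicit withdraw).
    return "add or refresh child" if parent in rt_targets else "remove child"


def outcome(constrained: bool) -> Dict[str, str]:
    new_rt = {"new"}  # the RT now identifies only the new parent
    msgs = bgp_messages_on_rt_replace("old", "new", constrained)
    return {p: parent_reaction(p, m, new_rt) for p, m in msgs.items()}
```

In this model, outcome(True) and outcome(False) give the same result: 
the old parent removes the child either way, and MVPN has no say in 
when.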

Now suppose MVPN doesn't simply replace the RT on the Leaf A-D route, 
but adds a second RT, identifying the new parent.  MVPN would also have 
to replace the PMSI Tunnel attribute, to specify a new label for the new 
parent to use.  The old parent would see this route as a replacement 
route.  The route still identifies the old parent, but has a new label 
in its PMSI Tunnel attribute.  So the old parent will continue sending 
traffic to the child, but will use the new label.  Now both old and new 
parents are using the same label, and the result will be data duplication.
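
The duplication can be seen in a few lines.  This is only a toy model 
with made-up names and label values (100 for the old label, 200 for the 
new one); each parent keeps a map from child to the label it uses 
toward that child:

```python
from typing import Dict, List, Set

State = Dict[str, Dict[str, int]]  # parent -> {child: label}


def apply_leaf_route(
    state: State, parent: str, child: str, rt_targets: Set[str], label: int
) -> None:
    """Process the (replacement) Leaf A-D route at one parent."""
    if parent in rt_targets:
        state.setdefault(parent, {})[child] = label   # install or relabel
    else:
        state.get(parent, {}).pop(child, None)        # treated as a withdrawal


def simulate_add_second_rt() -> List[str]:
    """Return the parents sending to the child with the new label."""
    state: State = {"old": {"child": 100}}   # old parent uses label 100
    rts = {"old", "new"}                     # second RT added, old one kept
    for parent in ("old", "new"):
        apply_leaf_route(state, parent, "child", rts, 200)
    return sorted(p for p, kids in state.items() if kids.get("child") == 200)
```

Here simulate_add_second_rt() reports both parents, which is exactly 
the duplicate-delivery case.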

I don't think there is any feasible way to switch parents in a "make 
before break" fashion without requiring the old parent to have some 
explicit knowledge that the switch is taking place.  The procedure we 
chose is to have the old parent time out the data plane entry for the 
child.  While this is not the only possible procedure, I don't think 
there is anything both (a) simpler and (b) compatible with BGP's route 
distribution procedures and with the layering between MVPN and BGP.
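
A minimal sketch of that timeout, assuming a hypothetical ParentDataPlane 
class and a hold time chosen by the implementation (the value below is 
not taken from the draft).  The clock is passed in explicitly so the 
behavior is easy to follow:

```python
from typing import Dict, Optional, Tuple


class ParentDataPlane:
    """Per-child replication state kept by a parent on an IR tree."""

    def __init__(self, hold_seconds: float) -> None:
        self.hold = hold_seconds
        # child -> (label, expiry time or None if the route is still present)
        self.entries: Dict[str, Tuple[int, Optional[float]]] = {}

    def install(self, child: str, label: int) -> None:
        self.entries[child] = (label, None)

    def on_withdraw(self, child: str, now: float) -> None:
        """Route withdrawn: keep forwarding, but start the timer."""
        if child in self.entries:
            label, _ = self.entries[child]
            self.entries[child] = (label, now + self.hold)

    def replication_list(self, now: float) -> Dict[str, int]:
        """Drop expired entries and return the children still being served."""
        self.entries = {
            c: (label, exp)
            for c, (label, exp) in self.entries.items()
            if exp is None or now < exp
        }
        return {c: label for c, (label, _) in self.entries.items()}
```

The old parent keeps sending for the hold period after the withdraw 
arrives, which gives the new parent time to start, and BGP itself needs 
no MVPN-specific behavior.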

Of course, one could modify the MVPN/BGP layering by building more 
MVPN-specific knowledge into BGP, or one could even decide that section 
3.1 of RFC 4271 shouldn't apply to Leaf A-D routes.  Certainly there are 
some cases where BGP knows that certain MCAST-VPN routes have to be 
handled in a special way.  But that's a lot more complicated than having 
the parent node simply run a timer to time out the data plane states 
when a route is withdrawn.

Eric