[Idr] Re: draft-xu-idr-bgp-route-broker-01.txt

"xuxiaohu_ietf@hotmail.com" <xuxiaohu_ietf@hotmail.com> Wed, 02 August 2023 02:08 UTC

From: "xuxiaohu_ietf@hotmail.com" <xuxiaohu_ietf@hotmail.com>
To: Robert Raszuk <robert@raszuk.net>
CC: Srihari Sangli <ssangli@juniper.net>, Shraddha Hegde <shraddha@juniper.net>, "idr@ietf. org" <idr@ietf.org>
Date: Wed, 02 Aug 2023 02:08:09 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/WLkFwND_TxPDJU9EXbkyDeJBEBo>

Hi Robert,

From: Robert Raszuk <robert@raszuk.net>
Date: Tuesday, August 1, 2023 23:29
To: xuxiaohu_ietf@hotmail.com <xuxiaohu_ietf@hotmail.com>
Cc: Srihari Sangli <ssangli@juniper.net>, Shraddha Hegde <shraddha@juniper.net>, idr@ietf.org <idr@ietf.org>
Subject: Re: draft-xu-idr-bgp-route-broker-01.txt
Hi Xiaohu,

A few follow-up comments, but stay tuned for more :)

[Xiaohu] It actually leverages the route target constrained distribution mechanism😊

Cool. You have not mentioned it ... nor is it part of your reference section.

[Xiaohu] Good suggestion. I will add it as a reference in the next version.

[Xiaohu] If there were no permanent sessions between the publishers and subscribers, it seems it would be hard to update the route tables on virtual PE routers in real time.

Not really. That is the whole idea of "subscription" to information.

[Xiaohu] AFAIK, even in a pub/sub system, it’s desirable for the brokers/exchanges to establish persistent connections with the publishers. See the following text quoted from the RabbitMQ tutorial (https://www.rabbitmq.com/publishers.html#:~:text=Publishers%20are%20often%20long%20lived%3A%20that%20is%2C%20throughout,long%20as%20their%20connection%20or%20even%20application%20runs.):

Publishers are often long lived: that is, throughout the lifetime of a publisher it publishes multiple messages. Opening a connection or channel (session) to publish a single message is not optimal. Publishers usually open their connection(s) during application startup. They often would live as long as their connection or even application runs.

[Xiaohu] AFAIK, RabbitMQ (https://www.rabbitmq.com/connections.html), one of the most popular large-scale pub/sub systems, is built on TCP😊 Hence, the root cause of the scaling issue mentioned above should not be TCP.

That is completely not my point. When I referred to TCP I meant BGP-4 TCP.

TCP is a great protocol ... HTTP/1.1 or HTTP/2 runs on TCP - but the crux of the matter is that the sessions are short lived. There is no need (with few exceptions, such as active data streaming) to keep the sessions up for hours, days, months or years like in BGP.

[Xiaohu] In the data center network virtualization scenario where two-level RRs are deployed, the top-level RRs act as publishers, the bottom-level RRs, referred to as BGP route brokers, act as exchanges, and the virtual PE routers act as both publishers and subscribers. As a result, there should be long-lived connections between BGP route brokers and top-level RRs, also between BGP route brokers and virtual PE routers.
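To make the role mapping above concrete, here is a minimal Python sketch of a broker acting as the exchange between publishers and subscribers. It is only an illustration under stated assumptions: the class `RouteBroker`, its methods, and the names `vpe1`, `vpe2`, `top-rr-1` and `RT:1` are hypothetical and do not come from the draft, and no BGP encoding or session handling is modeled.

```python
from collections import defaultdict


class RouteBroker:
    """Toy stand-in for a bottom-level RR acting as the pub/sub exchange.

    Hypothetical model: real route brokers speak BGP over long-lived sessions.
    """

    def __init__(self):
        # route target -> names of vPE routers subscribed to it
        self.subscribers = defaultdict(set)
        # vPE name -> list of (route target, prefix) pushed to that vPE
        self.delivered = defaultdict(list)

    def subscribe(self, vpe, route_target):
        """A vPE registers interest in the routes of one VPN (route target)."""
        self.subscribers[route_target].add(vpe)

    def publish(self, publisher, route_target, prefix):
        """A top-level RR or a vPE publishes a route; push it to subscribers."""
        for vpe in sorted(self.subscribers[route_target]):
            if vpe != publisher:  # never echo a route back to its publisher
                self.delivered[vpe].append((route_target, prefix))


broker = RouteBroker()
broker.subscribe("vpe1", "RT:1")
broker.subscribe("vpe2", "RT:1")
broker.publish("vpe1", "RT:1", "10.0.0.0/24")      # vPE acting as publisher
broker.publish("top-rr-1", "RT:1", "10.0.1.0/24")  # top-level RR as publisher
```

In this toy model `vpe1` is both a publisher and a subscriber, which matches the dual role of virtual PE routers described above.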


 #3 - You talk about overlay - but if there is overlay then usually there is also underlay. Dependency on underlay to detect nodes going down should be sufficient to trigger overlay invalidation.

[Xiaohu] Assuming there are tens of thousands of vPE routers, route aggregation on the underlay is a good practice. Furthermore, considering VM migration or frequent container creation/deletion, a proactive route distribution mechanism is a must.

Even if you have reachability aggregation in the underlay there are known ways to detect the liveness of the peer.

[Xiaohu] I agree. BFD is a concrete example😊 However, the liveness of the peer doesn’t mean the routes originated by that peer are still valid; think about the VM migration scenario.
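The distinction between peer liveness and route liveness can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration: the function `usable_next_hop`, the two dictionaries, and the addresses are hypothetical, and the liveness map merely stands in for whatever a mechanism such as BFD would report.

```python
def usable_next_hop(prefix, originator_of, peer_is_up):
    """Return the next hop for prefix, or None if it is unknown or its peer is down."""
    peer = originator_of.get(prefix)
    if peer is None or not peer_is_up.get(peer, False):
        return None
    return peer


# Hypothetical state: liveness as reported by the underlay, routes by the overlay.
peer_is_up = {"vpe1": True, "vpe2": True}
originator_of = {"10.1.1.5/32": "vpe1"}

before = usable_next_hop("10.1.1.5/32", originator_of, peer_is_up)

# The VM migrates from vpe1 to vpe2. Both peers stay up, so liveness detection
# alone signals nothing; only an overlay route update changes the mapping.
originator_of["10.1.1.5/32"] = "vpe2"

after = usable_next_hop("10.1.1.5/32", originator_of, peer_is_up)
```

Here `peer_is_up` is identical before and after the migration, while the correct next hop has changed.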

#4 - You talk about PEs. But in modern L3 fabric MSDCs it is compute nodes (or actually even smartNICs) which are acting as PEs to their local tenants.

[Xiaohu] Agree. That’s the reason why the virtual PE concept is mentioned in the draft.

Ok. But also just think how your proposal works when vPEs or PEs are mobile nodes ....

[Xiaohu] Interesting, I have not yet thought about that case. Could you please say more about the scenarios where vPEs or PEs are mobile nodes?

So how about you think about the DNS model or the LISP mapping plane model instead here to distribute overlay address to underlay next hop mappings?

[Xiaohu] By using DNS or LISP, how could the virtual PE routers obtain the VPN routes in real time?

By subscribing to interested information.

With HTTP/3 session establishment between known peers takes 0 RTT !  Just a hint :)

[Xiaohu] Interesting, this hint reminds me of the use of XMPP as an alternative to BGP/Netconf in TF (Tungsten Fabric, also known as OpenContrail)😊


So to summarize, it is evident that you are really concerned about the number of TCP-based BGP4 sessions.
The number of routes will still hit you, as top-level Route Reflectors (or Route Servers) will likely get requests to carry all VPN memberships.

[Xiaohu] There is no need to have any one top-level route reflector maintain all the VPN routes for all the VPNs supported by the data center. The scalability issue lies in the bottom-level route reflectors.

With dynamic VPN membership signalling you do not know that.

[Xiaohu] There are some details which differ from the RTC approach. Please refer to the following text quoted from Section 3 (https://datatracker.ietf.org/doc/html/draft-xu-idr-bgp-route-broker#section-3-2): “Top-level route reflectors, referred to as route servers, advertise route target membership information according to the preconfigured block of Route Targets. As such, route brokers know the VPNs associated with each of them. The route target membership information received from route servers SHOULD NOT be reflected by route brokers to any other iBGP peers further ... The route target membership information received from route broker clients would be deemed by route brokers as an implicit route request for all the VPN routes for the VPNs associated to the corresponding route targets, and only need to be reflected towards the corresponding route servers which are associated with the VPNs associated with the advertised route targets.”
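The handling rules in the quoted text can be sketched roughly as follows in Python. This is an illustrative reading only, not the draft's specification: the class `RouteBrokerRtState`, its method names, and the identifiers `rs1`, `rs2`, `vpe1` and the `RT:` values are hypothetical, and BGP message formats are not modeled.

```python
class RouteBrokerRtState:
    """Toy model of how a route broker could treat route target membership."""

    def __init__(self):
        self.rt_to_servers = {}  # route target -> route servers handling it
        self.rt_to_clients = {}  # route target -> clients that requested it

    def on_membership_from_server(self, server, route_targets):
        """Record which VPNs a route server handles; reflect to no other peer."""
        for rt in route_targets:
            self.rt_to_servers.setdefault(rt, set()).add(server)
        return {}  # membership learned from route servers is not reflected

    def on_membership_from_client(self, client, route_targets):
        """Treat client membership as an implicit route request.

        Returns {route server: route targets to reflect to it}, covering only
        the route servers associated with the advertised route targets.
        """
        reflect = {}
        for rt in route_targets:
            self.rt_to_clients.setdefault(rt, set()).add(client)
            for server in self.rt_to_servers.get(rt, ()):
                reflect.setdefault(server, set()).add(rt)
        return reflect


state = RouteBrokerRtState()
state.on_membership_from_server("rs1", ["RT:100", "RT:101"])
state.on_membership_from_server("rs2", ["RT:200"])
to_reflect = state.on_membership_from_client("vpe1", ["RT:100", "RT:200"])
```

With these inputs, the request from `vpe1` is reflected to `rs1` for `RT:100` and to `rs2` for `RT:200` only, so no single route server needs to see every VPN.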

See the other problem you are facing here is that today all L3VPN or EVPN deployments use static (management station driven) RT assignments.

Sure you can spread RT blocks between RRs, but this is what RTC has been suggesting nearly 20 years back :)

[Xiaohu] I’m very willing to reuse or leverage the existing mechanisms such as RTC.

Your proposal does mitigate the former by introducing the notion of "BGP Broker" - but instead of making a half step I would encourage you to think further and use a distribution mechanism other than the BGP protocol.

[Xiaohu] Since BGP has been used as an overlay routing protocol by some open-source or commercial SDN solutions, such as Tungsten Fabric, it seems worthwhile to consider how to further improve the scalability of the BGP-based IP VPN technology when deploying it in hyperscale data center network virtualization environments.

I was sure you are going to quote Contrail here :)

Sure I am all for improving what's inefficient. But sometimes I like to also think outside of the box first. Will share a note which I have started to write about this problem space sometime back - but now your draft has fueled the new energy to finish it :)

[Xiaohu] It’s a real problem in the real world and therefore it’s worthwhile to deal with it😊

Best regards,
Xiaohu

Cheers,
Robert



Or if you like, go for using a hierarchy of route brokers with sessionless overlay transport.
[Xiaohu] I have not got your point of using a hierarchy of route brokers. Could you please say more?


Last but not least, I am really surprised that with such a proposal you are also not suggesting to use the notion of aggregate withdraws as defined in https://www.ietf.org/archive/id/draft-raszuk-aggr-withdraw-00.txt

[Xiaohu] I have not yet found how to use the notion of aggregate withdraws to address the scalability issues mentioned above. Let me think about it further. However, the draft does use a trick similar to yours (see Section 6).

Thanks a lot for your comments and suggestions again😊

Kind regards,
Robert