Re: [Idr] I-D Action: draft-kaliraj-idr-multinexthop-attribute-01.txt

Kaliraj Vairavakkalai <kaliraj@juniper.net> Tue, 06 July 2021 18:00 UTC

From: Kaliraj Vairavakkalai <kaliraj@juniper.net>
To: Robert Raszuk <robert@raszuk.net>
CC: Minto Jeyananth <minto@juniper.net>, "idr@ietf. org" <idr@ietf.org>
Date: Tue, 06 Jul 2021 18:00:20 +0000
Message-ID: <DM6PR05MB6505E2185EEA0BA806FFF901A21D9@DM6PR05MB6505.namprd05.prod.outlook.com>
References: <162340175034.6148.8928864955067799770@ietfa.amsl.com> <CAOj+MMEG6vx7zAJcLAgyuXGPcuvuus=PU48aANJ93VKTLeV9dA@mail.gmail.com> <MN2PR05MB65117FDA62543A0A3A75B165A21E9@MN2PR05MB6511.namprd05.prod.outlook.com> <CAOj+MMFV=3xVzdr0M9VM+u2N_PYuOE0Q_t+2y1zoZdEYmhTS2w@mail.gmail.com> <MN2PR05MB65113F8B521421AE2E4829DFA21E9@MN2PR05MB6511.namprd05.prod.outlook.com>, <CAOj+MMHngcqSX-HrrbShPXHHUxLx7S4M+iuQhJ7_tsNrPkgKeg@mail.gmail.com>
In-Reply-To: <CAOj+MMHngcqSX-HrrbShPXHHUxLx7S4M+iuQhJ7_tsNrPkgKeg@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/JS3lQqObrXanE-4r4SXfxgScywE>
Subject: Re: [Idr] I-D Action: draft-kaliraj-idr-multinexthop-attribute-01.txt

> And how would you detect liveness of all of those parallel links ? Static + BFD ?

BFD to the BGP nexthop elements (instead of the bgp-peer address).

> What's wrong with today's solution of using loopback for next hop + disable-connected-check knob and point static routes to resolve such next hop over parallel links ? Sure you can use more than one loopback here as well and that loopback is only significant locally over EBGP boundary.

Firstly, consider not using ebgp-multihop, and instead running N EBGP sessions over N parallel links to be able to do ECMP/WECMP (call it model A). In this model the RIB size is increased N times, where N is the number of parallel links. With N=4 or 8 this may be tolerable today, but when N goes higher (128, 512), this model is unmanageable. So we need to go ebgp-multihop between the loopbacks (model B). But then we need an intermediate static route to a) provide connectivity and b) resolve the BGP PNH over. In that scenario, the following problems exist for the nexthop-resolution part (b):


  *   Different service routes cannot take different sets of parallel links. This was possible in model A.
  *   All service routes need to share the same load-balancing treatment that is given to the static route to the lo0. In other words, the per-service-route WECMP treatment that was possible with model A can no longer be guaranteed.

So in essence, we don’t get the same functionality as model A. Further,


  *   The EBGP routes need nexthop resolution, because the BGP PNHs are no longer directly connected. That is not bad in itself, but it is additional load on the system that may be desirable to avoid.
  *   Traffic restoration on link failure is no longer “a local-repair event confined to the forwarding plane”, because the PNHs are not directly connected interface addresses. The control plane has to assist in detecting some unreachability events.
  *   Some platforms may not support preserving the nexthop chain/hierarchy in the forwarding plane, in which case convergence will take longer. All these problems are avoided if we just use an ECMP (unilist nexthop) of directly connected nexthops in the forwarding plane.
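
To put rough numbers on the RIB-size point, here is a small Python sketch. It only counts path entries; the function names and the prefix count are made up for illustration, and it assumes that in model A every session advertises every service prefix once, and that model B adds a single static route toward the peer loopback.

```python
def rib_paths_model_a(num_prefixes: int, num_links: int) -> int:
    """Model A: one EBGP session per parallel link, each carrying every prefix."""
    return num_prefixes * num_links


def rib_paths_model_b(num_prefixes: int, num_links: int) -> int:
    """Model B: one ebgp-multihop session between loopbacks, plus one static
    route to the peer loopback (assumed; its nexthops fan out over the links)."""
    return num_prefixes + 1


PREFIXES = 100_000  # hypothetical number of service prefixes

for n in (4, 8, 128, 512):
    a = rib_paths_model_a(PREFIXES, n)
    b = rib_paths_model_b(PREFIXES, n)
    print(f"N={n:>3}: model A holds {a:>10,} paths, model B holds {b:>8,}")
```

The count for model B stays flat as N grows, which is the scaling win; what it gives up is listed in the bullets above.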

When using MNH (model C), we retain the nexthop-formation properties of model A, along with the RIB-size reduction of model B. We use one ebgp-multihop session, and advertise N (e.g. 512) nexthops in the MNH attribute for one route and 8 nexthops for another. Each route can have its own ECMP/WECMP/Relative-Preference specified. So the operator has better control over which service routes get what treatment at the ASBR boundary, and can continue to monetize those services (as in model A) with reduced RIB-size consumption.
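
As a rough illustration of model C, the Python sketch below models routes that each carry their own nexthop set and weights, with a hash-based flow-to-nexthop mapping. This is not the MNH wire encoding from the draft; the class names, addresses, weights and the SHA-256 flow hash are all assumptions chosen for the example.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class Nexthop:
    address: str
    weight: int = 1  # equal weights give plain ECMP, unequal give WECMP


@dataclass(frozen=True)
class Route:
    prefix: str
    nexthops: tuple  # tuple of Nexthop, carried per route


def pick_nexthop(route: Route, flow_key: str) -> Nexthop:
    """Map a flow to one nexthop of the route, in proportion to the weights."""
    total = sum(nh.weight for nh in route.nexthops)
    if total <= 0:
        raise ValueError("route has no usable nexthops")
    digest = hashlib.sha256(flow_key.encode()).digest()
    slot = int.from_bytes(digest[:8], "big") % total
    for nh in route.nexthops:
        if slot < nh.weight:
            return nh
        slot -= nh.weight
    raise AssertionError("unreachable: slot is always below the weight total")


# Two routes learned over the same single session, with different treatment.
bulk = Route("198.51.100.0/24",
             tuple(Nexthop(f"10.0.{i // 256}.{i % 256}") for i in range(512)))
premium = Route("203.0.113.0/24",
                (Nexthop("192.0.2.1", weight=3), Nexthop("192.0.2.2", weight=1)))

print(pick_nexthop(bulk, "flow-42").address)
print(pick_nexthop(premium, "flow-42").address)
```

The point is only that the nexthop set and the weights live on the route, not on a shared static route, so two routes on the same session can be load-balanced differently.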

2> When such interface IngressIP1 or IP2 goes down you need to repair the encapsulation all the way in the edge instead of just performing local repair to get to the same loopback in milliseconds.

No. The repair happens at the nearest BN, called the Forwarding-Helper, so it is sub-second. And actually this is orthogonal to whether we use IngressIP or lo0. In this case the IngressIPs used as PNH are not visible outside this region at all, so there is no way a convergence event propagates to the other edge PE. The BN does a label-swap to the IngressIP to make forwarding happen, so it absorbs any failure event within this region.

The BN also needs more optimal forwarding information, so that intra-fabric hops can be avoided. This was a virtualization solution, where a physical box was split into multiple Virtual-Network-Functions. But the same model is followed when multiple physical boxes are involved too (e.g. N PEs and a pair of BNs in a region), and they are represented by one Anycast-nexthop to the rest of the network. The real PE loopbacks are not visible outside their home region, thus reducing network-wide transport-layer scale.
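
A toy Python model of that repair behaviour, purely for illustration: the class, method names and addresses are hypothetical, and the modulo flow mapping stands in for whatever load-balancing the platform really does. It only shows that an IngressIP failure changes state at the BN while the nexthop seen by the remote PE stays the same.

```python
class BorderNode:
    """Toy BN (Forwarding-Helper): swaps toward one of the live IngressIPs."""

    def __init__(self, anycast_nexthop, ingress_ips):
        # The only address the rest of the network ever sees.
        self.anycast_nexthop = anycast_nexthop
        # Region-local state, never advertised outside the region.
        self.live = list(ingress_ips)

    def link_down(self, ingress_ip):
        """Local repair: drop the failed IngressIP from the forwarding set."""
        self.live = [ip for ip in self.live if ip != ingress_ip]

    def forward(self, flow_id):
        """Return the IngressIP that the label-swap sends this flow to."""
        if not self.live:
            raise RuntimeError("no live IngressIP left in this region")
        return self.live[flow_id % len(self.live)]


bn = BorderNode("192.0.2.100", ["10.1.1.1", "10.1.1.2"])
remote_pe_nexthop = bn.anycast_nexthop   # what the far-edge PE resolves over

bn.link_down("10.1.1.1")                 # failure inside the region

assert bn.forward(0) == "10.1.1.2"       # the BN repaired locally
assert remote_pe_nexthop == bn.anycast_nexthop  # nothing changed at the edge
```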

Thanks
Kaliraj
From: Robert Raszuk <robert@raszuk.net>
Date: Saturday, July 3, 2021 at 12:46 PM
To: Kaliraj Vairavakkalai <kaliraj@juniper.net>
Cc: Minto Jeyananth <minto@juniper.net>, idr@ietf. org <idr@ietf.org>
Subject: Re: I-D Action: draft-kaliraj-idr-multinexthop-attribute-01.txt

Hi,

> KV2> EBGP is not discounted. One simple example where this can be useful for EBGP
> is the inter-AS parallel-links case. Even when using one bgp-session between loopbacks
> over the parallel-links, the MNH can be used to specify how the load-balancing should
> happen on the parallel links.

And how would you detect liveness of all of those parallel links ? Static + BFD ?

What's wrong with today's solution of using loopback for next hop + disable-connected-check knob and point static routes to resolve such next hop over parallel links ? Sure you can use more than one loopback here as well and that loopback is only significant locally over EBGP boundary.


KV2> there are some scenarios where it could be. The PE sends the RD:Pfx with MNH(IngressIP1, IngressIP2) thus attracting traffic to the more-optimal ingress-Forwarding-elements. Avoiding intra-fabric hops. The mechanism was invented to solve such a scenario only.

IMO this use case is a pretty bad idea. When such interface IngressIP1 or IP2 goes down you need to repair the encapsulation all the way in the edge instead of just performing local repair to get to the same loopback in milliseconds. It almost smells like some solution to fix platform deficiency in deployments of BGP as an IGP.

Tx,R.

