Re: [Idr] review comments on sections of draft-ietf-idr-bgp-ct

Kaliraj Vairavakkalai <kaliraj@juniper.net> Sun, 06 August 2023 23:08 UTC

From: Kaliraj Vairavakkalai <kaliraj@juniper.net>
To: "Swadesh Agrawal (swaagraw)" <swaagraw=40cisco.com@dmarc.ietf.org>, "idr@ietf.org" <idr@ietf.org>
Date: Sun, 06 Aug 2023 23:07:58 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/a3zJ4y7eumYTU9Fc-xUJn9shtLU>
Subject: Re: [Idr] review comments on sections of draft-ietf-idr-bgp-ct

Swadesh,

At the risk of repeating myself again, let me try one last time. KV3>

Thanks
Kaliraj


Juniper Business Use Only
From: Swadesh Agrawal (swaagraw) <swaagraw=40cisco.com@dmarc.ietf.org>
Date: Friday, August 4, 2023 at 1:28 PM
To: Kaliraj Vairavakkalai <kaliraj@juniper.net>, idr@ietf.org <idr@ietf.org>
Subject: Re: review comments on sections of draft-ietf-idr-bgp-ct
[External Email. Be cautious of content]

Hi Kaliraj,

Thanks for the response. Just to capture a summary since the thread’s gotten quite nested.

There were 3 issues called out, none of which has been addressed.
1.       The first one is a significant design limitation that the use of unique RDs results in:
a.       Lack of multipath and localized fast convergence within a domain for originator failures (even though an operator has deployed redundant routers)
b.       To achieve multipath, ‘stripping RD’ and ‘TC, IP’ allocation results in advertisement of duplicate/redundant BGP routes with the same forwarding label
c.       which in turn increases control plane state on all routers upstream across multiple domains, and exposes the failure churn outside the originating domain all the way to ingress PEs in other domains
KV3> In your above text, you agree that 1.b solves 1.a, but at the cost of 1.c.
KV3> That increase in control plane state allows ingress-PE to determine how many egress-PEs are currently hosting an anycast service.
KV3> If that visibility is not desired, and an operator wants to reduce control plane state at ingress-PE, same RD may be used:
KV3> excerpt from Sec 6.3<https://www.ietf.org/archive/id/draft-ietf-idr-bgp-ct-12.html#section-6.3>:

                  Alternatively, the same RD may be provisioned for multiple
   originators of the same EP.  This mode can be used when the ingress
   does not require full visibility of all nodes originating an EP.

KV3> BGP CT allows operators full flexibility of achieving what a deployment needs.


This is not a new issue, it has existed since day-1 of CT and still remains, in spite of all the revisions of the draft.

This is not just an editorial issue. It is a significant deviation from currently deployed BGP-LU, which does not have these duplicate route/churn issues and provides multipath/active-backup within each local domain. It is a manifestation of the wrong data model of signaling RD in NLRI for BGP hop by hop routes.
The limitations need to be captured clearly in the draft as impact of the respective options, if they are not going to be addressed.

2.       The second issue is that of the inefficiency caused by the choice of the CT NLRI which only supports MPLS labels in the NLRI. Any use other than MPLS, such as SR prefix-SID (label-index) or SRv6 SID means every route needs to be sent in a separate BGP update message with no packing possible. The scale/performance test data completely ignores this issue, and shows data for a non-existent problem.

KV3> The update packing test results show numbers for a scenario where “Update packing does not happen”. They hold for any reason
KV3> why update packing may not happen (e.g. dissimilar AIGP attributes, BGP communities, LOCAL_PREF, SIDs, color-carrying communities).
KV3> The test result can be extrapolated to the other cases too, because all of them break update packing.
KV3> Everything cannot be carried in NLRI to micro-optimize update packing.
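
The packing argument above can be illustrated with a minimal Python sketch. It assumes a simplified model in which routes can share one UPDATE only when their path attributes are identical, and it ignores the maximum BGP message size. The `pack_updates` helper, the attribute names (`color`, `label_index`) and the prefixes are hypothetical and are not taken from the draft or from the test report.

```python
from collections import defaultdict

def pack_updates(routes):
    """Group NLRIs by identical attribute set; each group models one UPDATE.

    Simplified model (assumption): the maximum BGP message size is ignored.
    """
    updates = defaultdict(list)
    for nlri, attrs in routes:
        # Two NLRIs can share an UPDATE only if every attribute matches.
        key = tuple(sorted(attrs.items()))
        updates[key].append(nlri)
    return dict(updates)

prefixes = [f"10.0.0.{i}/32" for i in range(1, 7)]

# Case 1: only a shared attribute (color) is carried, so all routes pack together.
shared = [(p, {"color": 100}) for p in prefixes]

# Case 2: a per-prefix attribute (hypothetical label_index) makes every set unique.
per_prefix = [(p, {"color": 100, "label_index": i}) for i, p in enumerate(prefixes)]

shared_updates = pack_updates(shared)
per_prefix_updates = pack_updates(per_prefix)
print(len(shared_updates), len(per_prefix_updates))
```

In this model any attribute that differs per route has the same effect on packing, whichever attribute it is.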

3.       The 3rd issue is that the draft introduces non-deterministic usage of IMPLICIT NULL. It can result in mis-delivery of traffic, and it is an operational burden to make sure no MPLS path exists to the next hop. This is again a result of CT mandating signaling of a label in the NLRI even for non-MPLS encapsulation.

KV3> In Fig 10<https://www.ietf.org/archive/id/draft-ietf-idr-bgp-ct-12.html#figure-10>, if you have an ingress node R1 that assumes it has successfully signaled
KV3> an MPLS transport tunnel to a device R4 that does not support MPLS forwarding at all,
KV3> it is a bug in R1, outside the scope of BGP CT. E.g., if that ingress node R1 receives an AFI/SAFI 1/1
KV3> route with R4 as next hop, that will also result in an MPLS packet being ‘attempted’ to be sent towards R4.
KV3> That is not a problem in the AFI/SAFI 1/1 procedures, just a bug in the R1 implementation.
KV3> As stated already, Implicit-Null in a BGP route only says that no MPLS label needs to be pushed
KV3> “at that BGP-route’s layer”. It does not make any assumptions about the transport tunnel that
KV3> the route resolves over.
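
A minimal Python sketch of the point under discussion, assuming a simplified forwarding model: the label carried in the BGP route and the labels of the transport tunnel are handled as separate layers. Label value 3 is the reserved Implicit NULL value. The `build_label_stack` helper and the tunnel label 24001 are hypothetical.

```python
IMPLICIT_NULL = 3  # reserved MPLS label value (RFC 3032)

def build_label_stack(bgp_label, tunnel_labels):
    """Return the label stack, outermost first, for a route resolved over a tunnel.

    bgp_label: label carried in the BGP-LU/BGP-CT route.
    tunnel_labels: labels of the transport tunnel to the next hop; an empty
    list models a non-MPLS tunnel (assumption of this sketch).
    """
    stack = list(tunnel_labels)
    if bgp_label != IMPLICIT_NULL:
        # The BGP-layer label sits below the transport tunnel label(s).
        stack.append(bgp_label)
    return stack

# Implicit NULL at the BGP layer, no MPLS tunnel: nothing is pushed.
print(build_label_stack(IMPLICIT_NULL, []))
# Implicit NULL at the BGP layer, but an MPLS tunnel exists: tunnel label only.
print(build_label_stack(IMPLICIT_NULL, [24001]))
# Ordinary BGP label over an MPLS tunnel.
print(build_label_stack(16001, [24001]))
```

The second case is the one the two positions disagree about: the BGP layer pushes nothing, yet the packet still leaves as MPLS whenever an MPLS tunnel to the next hop exists.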

Further see my response in line with [SA2]

KV3> The above responses cover these as well. Thanks.

Regards
Swadesh




From: Kaliraj Vairavakkalai <kaliraj=40juniper.net@dmarc.ietf.org>
Date: Thursday, July 27, 2023 at 6:24 PM
To: Swadesh Agrawal (swaagraw) <swaagraw@cisco.com>, idr@ietf.org <idr@ietf.org>
Subject: Re: review comments on sections of draft-ietf-idr-bgp-ct
Hi Swadesh, please see inline. KV2>

Thanks
Kaliraj


From: Swadesh Agrawal (swaagraw) <swaagraw=40cisco.com@dmarc.ietf.org>
Date: Sunday, July 23, 2023 at 3:13 PM
To: Kaliraj Vairavakkalai <kaliraj@juniper.net>, idr@ietf.org <idr@ietf.org>
Subject: Re: review comments on sections of draft-ietf-idr-bgp-ct

Hi Kaliraj,

Thanks for responding. Please see my comments inline with [SA] for each of the response.

Regards
Swadesh



From: Kaliraj Vairavakkalai <kaliraj=40juniper.net@dmarc.ietf.org>
Date: Friday, July 21, 2023 at 9:40 PM
To: Swadesh Agrawal (swaagraw) <swaagraw@cisco.com>, idr@ietf.org <idr@ietf.org>
Subject: Re: review comments on sections of draft-ietf-idr-bgp-ct
Swadesh, thanks for your review.

Inline pls.

Thanks
Kaliraj



From: Idr <idr-bounces@ietf.org> on behalf of Swadesh Agrawal (swaagraw) <swaagraw=40cisco.com@dmarc.ietf.org>
Date: Thursday, July 20, 2023 at 11:29 AM
To: idr@ietf.org <idr@ietf.org>
Subject: [Idr] review comments on sections of draft-ietf-idr-bgp-ct

Hi Sue,

Please see my review comments regarding a few sections as you requested.

Unique RD usage and caveats (related to F3-CT-Issue-6) :

It can be seen from Figure 13, rows 4 and 6, that failure of an originator (such as an ABR) results in slow convergence, because the LSP is end to end and the failure must be propagated to the ingress PE before it can converge.

KV> And from rows 1,3,5,7,9,11, that failure is not propagated all the way to the ingress-PE. This table is an exhaustive list of all possible combinations.
[SA] The CT draft recommends use of unique RDs. Hence, for the recommended case, i.e. rows 2,4,6,8,10 and 12, convergence is slow, as the LSP is end to end and failure of the originator needs to be propagated to the ingress PE to converge. Further, as per my understanding, control plane churn of CT routes is not localized for rows 1,3,5,7,9 and 11 either. Please see the next comment for an explanation.

KV2> as stated in Figure-13<https://datatracker.ietf.org/doc/html/draft-ietf-idr-bgp-ct-12#figure-13>, CT provides a sliding control that operators can use to control route-visibility vs route-scale at ingress PE.
KV2> it is an exhaustive list of combinations. Unique RDs improve troubleshooting and route visibility, with an increase in ingress route-scale.
KV2> Duplicate RDs can be used to reduce ingress routes, if required, with limited egress-PEs visibility.

[SA2] Once again, you are attempting to pass off the serious limitation of end to end slow convergence due to unique RD as a route visibility and troubleshooting choice.  Moreover, the workaround of stripping RD at BRs does not reduce the control plane scale and churn either. In fact it exposes the problem of carrying same forwarding LSP information in redundant BGP routes from a device. There is no logical reason to incur this complexity and overhead.

[SA2] This is not a “sliding control” but a design problem with the CT NLRI data model. The CT draft should really capture the limitations for each of the rows of figure 13 as discussed in this thread, to allow operators to make the appropriate choices.

To avoid it, "RD stripping" or “TC,EP” label allocation procedures at BNs are stated as an option in section 7.4. But even with that option, the control plane churn is still not kept within the local domain, as the CT control plane is signaling redundant routes that carry the same label.

KV> About the “churn not kept local” claim, not true. The advertised CT label does not change when local failure events happen and nexthop changes from one
KV> nexthop to another. Because of “TC, EP” label allocation mode.  So Churn is not propagated further than local BN. IOW, no new updates are sent and
KV> ingress-PE does not see this failure event.
[SA] Here is my understanding of CT procedures. Let's take the topology of figure 12 and the row 5 case of figure 13. The 4 PEs (PE11-PE14) have RDs PE11 to PE14 respectively, as this is the unique RD case. The anycast address 1.1.1.1 is hosted on all 4 PEs. The CT routes (RD:IP) learnt on ASBR11 from the originator PEs are PE11:1.1.1.1, PE12:1.1.1.1, PE13:1.1.1.1 and PE14:1.1.1.1, all with the same TC GOLD. ASBR11 allocates label 16001 for (GOLD, 1.1.1.1) and advertises the 4 routes PE11:1.1.1.1, PE12:1.1.1.1, PE13:1.1.1.1 and PE14:1.1.1.1 with label 16001 to PE31.
Issue 1: The above 4 routes carry exactly the same label 16001 to PE31. This is unnecessary control plane scale with the same forwarding information.
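
The row 5 example can be sketched in a few lines of Python. This is only an illustration of per-(TC, EP) label allocation as described in this thread, not the draft's procedure text; the `LabelAllocator` class and its starting label are hypothetical, chosen to match the 16001 used in the example.

```python
from itertools import count

class LabelAllocator:
    """Allocates one label per (transport class, endpoint) pair."""

    def __init__(self, first_label=16001):
        self._next = count(first_label)
        self._by_key = {}

    def label_for(self, tc, ep):
        key = (tc, ep)
        if key not in self._by_key:
            # First time this (TC, EP) is seen: take the next free label.
            self._by_key[key] = next(self._next)
        return self._by_key[key]

alloc = LabelAllocator()

# Routes learnt on the border node: (RD, endpoint, transport class).
received = [(f"PE1{i}", "1.1.1.1", "GOLD") for i in range(1, 5)]

# One advertised route per RD:EP, each carrying the (TC, EP) label.
advertised = [(f"{rd}:{ep}", alloc.label_for(tc, ep)) for rd, ep, tc in received]
labels = {label for _, label in advertised}
print(len(advertised), labels)
```

Four NLRIs are advertised while only one label is in use, which is the route-count versus forwarding-state gap that Issue 1 describes.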

KV2> That’s correct understanding of the behavior. but not an issue per-se. It provides visibility into how many
KV2> egress-PEs are currently serving the Anycast-service. Some of our customers see that as a good feature.

[SA2] This is not a feature but a design limitation of the CT NLRI. It results in unnecessary redundant BGP routes that increase end to end scale and churn without adding any functionality. This needs to be fixed or called out clearly in the draft.

Issue 2: Now if PE11 goes down, ASBR11 needs to withdraw the (PE11:1.1.1.1) CT route from ingress PE31.
So local domain control plane churn is propagated to PE31.
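
A minimal Python sketch of the withdrawal behavior in Issue 2, comparing it with the same-RD mode mentioned elsewhere in this thread. It assumes a simplified model in which a border node advertises one route per distinct RD:EP key, and in which a shared RD yields a single route for as long as any originator remains. The helper names and the `SHARED` RD value are hypothetical.

```python
def advertised_keys(originators, ep, unique_rd):
    """NLRI keys a border node would advertise for one endpoint.

    unique_rd=True: one RD:EP route per originator.
    unique_rd=False: originators share one RD, so a single route is advertised
    while at least one originator remains (assumption of this sketch).
    """
    if unique_rd:
        return {f"{pe}:{ep}" for pe in originators}
    return {f"SHARED:{ep}"} if originators else set()

def withdrawals(before, after, ep, unique_rd):
    """Routes that must be withdrawn toward the ingress after a failure."""
    return advertised_keys(before, ep, unique_rd) - advertised_keys(after, ep, unique_rd)

before = ["PE11", "PE12", "PE13", "PE14"]
after = ["PE12", "PE13", "PE14"]  # PE11 has failed

unique = withdrawals(before, after, "1.1.1.1", unique_rd=True)
shared = withdrawals(before, after, "1.1.1.1", unique_rd=False)
print(unique, shared)
```

In this model the unique-RD case sends one withdrawal toward PE31, while the shared-RD case sends none until the last originator is gone.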

KV2> This is also regular BGP behavior. Not an issue. “Egress-PE down” case will be sent as BGP withdrawal
KV2> to ingress-PE in regular option-C (LU), except if you use MPLS-namespaces<https://datatracker.ietf.org/doc/html/draft-ietf-idr-bgp-ct-12#name-context-protocol-nexthop-ad>.
KV2> I was talking about the cases where the churn is kept local for events like link-down or nexthop/label-changes.
KV2> Because of per(EP, TC) label allocation mode, a new label is not advertised.

[SA2] The discussion is about row 5 of figure 13, where redundant routers originate the given endpoint. In such failures, BGP withdrawals need not be sent end to end but can be contained within the local domain, because a path is still available from a redundant originator. The CT handling above is not regular BGP behavior.

Row 5 shows that for 16 routes there are only 2 labels advertised. Multiple redundant routes are advertised with the same forwarding information, which increases control plane state. This was the issue raised as a problem created by the RD in the NLRI. The impact worsens as the number of anycast originators increases.

KV> Please pay attention to rows 7-12 also, which have only 2 or 4 routes advertised, with 2 unique labels. Operators have the flexibility to
KV> choose the desired visibility at ingress-PE, with the desired scaling characteristics. This table is an exhaustive list of all combinations.
KV> That helps operators to choose which mode fits their needs the best.
[SA] I did pay attention to rows 7-12 as well. Rows 7,8 and 11,12 are for the same RD case. This case defeats the draft’s stated purpose of using the RD in the NLRI. Rows 6 and 10 suffer from end to end slow convergence. Row 5 exposes the redundant route problem (16 routes for 2 forwarding states at the ingress PE), which worsens as the number of SNs increases. The same holds for row 9, which worsens as the number of BNs increases. The (TC,EP) allocation scheme does not contain control plane churn within the domain as claimed in section 7.4, as I stated in my previous email.

KV2> same as above.

[SA2] Yes. And as stated in the previous comment, it’s an issue and needs to be captured in the draft for each row of figure 13.

The updates to the draft do not address the raised issue. The draft states (in sec 7.4) that route churn is avoided and is proportional to the number of labels, but that is not the case, as explained above.

As a related observation, a solution for the above issue was proposed by the authors on the list: use the local RD of the BN node when “Stripping RD”. However, it looks like that solution has been discarded, as it’s not discussed in the draft.

BGP-CT-UPDATE-PACKING-TEST results included are for a scenario that is unrealistic in practice, and also do not cover relevant deployment cases:

For example, it captures 1.9 million BGP CT MPLS routes packed in 7851 update messages. That means about 250 routes share attributes and completely fill every update message. It seems the test was done with all routes (around 400k) for a given color having exactly the same attributes. This is not a practical example. A more practical case would use a realistic packing ratio, for example 5-6 routes per set of attributes.
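
The packing ratio behind this comment can be checked with a short Python calculation. It uses only the rounded figures quoted above (1.9 million routes, 7851 updates). The 5-routes-per-update case is the hypothetical ratio suggested in the comment, not a measured result.

```python
routes = 1_900_000        # "1.9 million BGP CT MPLS routes" (rounded figure)
updates = 7_851           # update messages reported by the test

# Average number of routes packed into each update in the reported test.
routes_per_update = routes / updates
print(round(routes_per_update))

# Hypothetical comparison: only 5 routes share each attribute set.
ratio = 5
updates_at_ratio_5 = -(-routes // ratio)  # ceiling division
print(updates_at_ratio_5)
print(round(updates_at_ratio_5 / updates, 1))
```

With the rounded inputs the reported test averages roughly 240 routes per update, in line with the "about 250" estimate, and a 5:1 ratio would need 380,000 updates for the same route count.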

KV> The goal of the experiment is to see the impact of carrying ‘Color as an Attribute’ as against ‘Color in NLRI’.
KV> The issue raised was that, carrying color as attribute will affect packing.
KV> So this experiment demonstrates that the observed convergence time is in accepted limits, even when color is carried as an attribute.
KV> In any controlled experiment, we want to vary one variable to observe the result.
[SA] I am not sure of such discussion. The observed issue was for label index and SRv6 SID that are per-prefix information with RFC 8277 style encoding that carries such information in attribute and breaks BGP update packing. In any case, it will be helpful to have such analysis of BGP CT for the WG.

More importantly, the test results do not include or analyze impact of label index, SRv6 SID etc. that are per-prefix information.

KV> This experiment provides actual benchmarking test results for one case (color as an attribute), which can be extrapolated to
KV> other cases as well, where a SID (label-index/SRv6) is carried as an attribute, just like the Color.

[SA] Just to reiterate, Color was not the discussion point for update packing. With just 5 colors across 1.9 million routes, nobody sees update packing as a concern.

KV2> OK. Btw, these scale requirements are from https://datatracker.ietf.org/doc/html/draft-hr-spring-intentaware-routing-using-color-01#section-6.3.2

The concern arises for label index and SRv6 SID that is per prefix information carried in attribute in BGP CT encoding. This breaks packing and has an impact.

KV2> This experiment shows the numbers with color carried as an attribute. Results will be comparable when
KV2> label-index/SRv6-SID/TEA are carried as attributes as well. IMHO, it may not be worth carrying all those
KV2> in the NLRI, in an attempt to micro-optimize this further.

[SA2] It is not a good choice for a new BGP SAFI to be designed such that every prefix needs to be carried in a separate update message when using label index (SR MPLS) and SRv6. It increases BGP control plane data size severalfold. Problems will be seen in scaled networks.

[SA2] Colors carried in an attribute were never a problem from an update packing point of view, and were not an issue called out in the adoption call. The issue was raised for label-index and SRv6 SID, which are per-prefix information. The results currently provided are of no practical use.

Non-deterministic usage of IMPLICIT NULL:

Implicit NULL is a valid MPLS label and indicates that the receiver has no label to push. A labeled path to the BGP nexthop is still valid/expected.
KV> intra-domain tunneled path to the BGP nexthop may or may-not be labeled. Implicit-Null label carried in BGP-LU/BGP-CT route
KV> doesn’t claim anything about the intra-domain tunnel. It just says no BGP-LU/BGP-CT label needs to be pushed in forwarding.
[SA] Thanks for the clarification on the procedure. But as I read the draft, it points towards a new meaning of IMPLICIT NULL. Quoting the exact text in the draft: “R4 will carry the special MPLS Label with value 3 (Implicit-NULL) in RFC 8277 encoding, which tells R1 not to push any MPLS label towards R4”. It would be better to put your response text into the draft.
Section 13.2.2.1 extends the presence of the implicit NULL label to indicate that the originator does not support MPLS. This is not possible, as the two cases cannot be distinguished.

KV2> Sure, will clarify the text to say, “Implicit-Null label carried in BGP-LU/BGP-CT route indicates that
KV2> no BGP-LU/BGP-CT label is pushed in forwarding”.

[SA2] Thanks. But with this procedure, mis-delivery of traffic is possible if an MPLS tunnel exists to the next hop. This should be captured in the draft.
KV> so, there is no ambiguity. Implicit-NULL is only saying no BGP-LU/BGP-CT label needs to be pushed in forwarding.
[SA] Same response as previous point.


For example, in figures 11 and 12, I am not sure why R3 won’t send MPLS traffic to R4, as stated in the last paragraph. The same problem exists in section 13.2.2.2.

KV> as shown in these figures, R4 does not support MPLS. So there can be no MPLS-tunnel from R3->R4.
KV> so why would R3 send MPLS traffic to R4? When R3 tries to resolve PNH==R4, it will find no matching
KV> MPLS tunnel, and the route will remain Unusable.
[SA] It’s an operational burden to make sure that no router has an MPLS path to R4 (an MPLS path can exist for other purposes). Otherwise there can be mis-forwarding with IMPLICIT-NULL in 8277 style encoding for non-MPLS encapsulation signaling (SRv6, UDP) in BGP CT. It should be captured in the draft.

KV2> R4 does not support MPLS. So there can be no MPLS path towards it. There is no operational burden.
KV2> Thanks for the comments.

[SA2] Previous point response applies here as well.

Regards
Swadesh