[Idr] Re: New Version Notification for draft-xu-idr-fare-01.txt

Jeff Tantsura <jefftant.ietf@gmail.com> Fri, 26 July 2024 18:28 UTC


Hi,

I’m quite surprised not to see any references to draft-ietf-bess-ebgp-dmz, which has enjoyed wide deployment in hyperscale networks and AI clusters and is available in at least 5 implementations.
I believe it solves all the shortcomings of the initial link-bandwidth draft. Transitivity of the community is left to the implementation and SHOULD be configurable. Since the community has to be regenerated at each hop (to provide the cumulative behavior), either would work; the only requirement is that the receiver should not make assumptions about transitivity and should accept either, which is by now the behavior of all major BGP implementations.
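The per-hop regeneration described above can be sketched as follows. This is a minimal illustration, not text from either draft: the `Path` type, the `regenerate` helper, the dictionary it returns, and the choice to simply sum the received values over the multipath set are all assumptions made for the example, and the bandwidth unit is arbitrary.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Path:
    """One multipath entry for a prefix (hypothetical type for illustration)."""
    next_hop: str
    link_bw: Optional[float]   # value from the received community, None if absent
    transitive: bool = False   # transitivity as received; never consulted below


def regenerate(multipaths: Sequence[Path],
               advertise_transitive: bool) -> Optional[dict]:
    """Build the community to attach when re-advertising with next-hop-self.

    Assumption: the cumulative value is the plain sum over the multipath
    set. The received transitivity flag is ignored on purpose, so the
    receiver accepts either form; the advertised form is a local knob.
    """
    values = [p.link_bw for p in multipaths]
    if not values or any(v is None for v in values):
        return None  # no usable bandwidth information, advertise nothing
    return {"link_bw": sum(values), "transitive": advertise_transitive}


# Two paths received with different transitivity settings.
paths = [Path("leaf1", 100.0, transitive=True),
         Path("leaf2", 40.0, transitive=False)]
community = regenerate(paths, advertise_transitive=False)
```

Because the value is rebuilt at every hop, the transitivity of what was received never affects what is sent, which is why either setting interoperates.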

We have been discussing normatively making the community transitive, either in the original draft or in draft-ietf-bess-ebgp-dmz (and changing the intended status). This should not affect existing behavior and can be changed per implementation if needed.

Editorial:
Adaptive routing is a misleading and incorrect term to use. It is called W-ECMP for a reason: there are no changes to the routing, only forwarding weight rebalancing.
The Introduction section is inaccurate and makes assumptions wrt rail domain size, traffic patterns, and some other artifacts of an AI cluster.
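To make the W-ECMP point concrete, here is a small sketch under stated assumptions: `wecmp_weights` is a hypothetical helper, the next-hop names and bandwidth figures are made up, and weights are taken to be simply proportional to the advertised bandwidth. It shows that a bandwidth change alters only the forwarding weights, while the set of next hops selected by routing stays the same.

```python
from fractions import Fraction
from typing import Dict


def wecmp_weights(bandwidth_by_next_hop: Dict[str, int]) -> Dict[str, Fraction]:
    """Return per-next-hop forwarding weights proportional to bandwidth."""
    total = sum(bandwidth_by_next_hop.values())
    if total <= 0:
        raise ValueError("total bandwidth must be positive")
    return {nh: Fraction(bw, total) for nh, bw in bandwidth_by_next_hop.items()}


# Same two next hops before and after; only the advertised bandwidth differs.
before = wecmp_weights({"spine1": 400, "spine2": 400})
after = wecmp_weights({"spine1": 400, "spine2": 200})

same_next_hops = before.keys() == after.keys()
```

Here `before` splits traffic 1/2 and 1/2, `after` splits it 2/3 and 1/3, and `same_next_hops` is true: the routes are unchanged and only the weights are rebalanced.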

Distributing congestion information and/or available BW (total BW - used) in BGP is an extremely bad and (given the characteristics of the feedback loop) rather harmful idea.

At this point in time I’m not going to go into specifics of the proposal.

I’m against the progress of this document and don't see a need for yet another solution that also introduces non-backwards-compatible changes to a widely deployed behavior.

Thanks,
Jeff

> On Jul 23, 2024, at 05:21, Tiger Xu <xuxiaohu_ietf@hotmail.com> wrote:
> 
> 
> Hi all,
> 
> [I-D.ietf-idr-link-bandwidth] has specified a way to perform weighted ECMP based on link bandwidths conveyed in the non-transitive link bandwidth extended community.  However, it is impractical to enable adaptive routing in a 5-stage CLOS network where eBGP is used as the underlay routing protocol by directly using the non-transitive link bandwidth extended community, due to the following constraints as mentioned in [I-D.ietf-idr-link-bandwidth].
> 
>    "No more than one link bandwidth extended community SHALL be attached
>    to a route.  Additionally, if a route is received with link bandwidth
>    extended community and the BGP speaker sets itself as next-hop while
>    announcing that route to other peers, the link bandwidth extended
>    community should be removed.  The extended community is optional non-
>    transitive."
> 
> Hence, this document defines a new extended community referred to as Path Bandwidth Extended Community and describes how to use this newly defined path bandwidth extended community to achieve adaptive routing in a 5-stage CLOS network.
> 
> Any comments or suggestions are welcome.
> 
> Best regards,
> Xiaohu
> 
> From: internet-drafts@ietf.org
> Date: Tuesday, July 23, 2024 20:10
> To: Hang Wu <wuhang@ruijie.com.cn>, Hongyi Huang <hongyi.huang@huawei.com>, Junjie Wang <wangjj@centec.com>, Peilong Wang <wangpeilong01@baidu.com>, Qingliang Zhang <zhangqingliang@h3c.com>, Shraddha Hegde <shraddha@juniper.net>, Tiezheng <litiezheng@ieisystem.com>, Xiaohu Xu <xuxiaohu_ietf@hotmail.com>, Yadong Liu <zeepliu@tencent.com>, Yinben Xia <forestxia@tencent.com>, Zongying He <zongying.he@broadcom.com>
> Subject: New Version Notification for draft-xu-idr-fare-01.txt
> 
> A new version of Internet-Draft draft-xu-idr-fare-01.txt has been successfully
> submitted by Xiaohu Xu and posted to the
> IETF repository.
> 
> Name:     draft-xu-idr-fare
> Revision: 01
> Title:    Fully Adaptive Routing Ethernet using BGP
> Date:     2024-07-21
> Group:    Individual Submission
> Pages:    11
> URL:      https://www.ietf.org/archive/id/draft-xu-idr-fare-01.txt
> Status:   https://datatracker.ietf.org/doc/draft-xu-idr-fare/
> HTMLized: https://datatracker.ietf.org/doc/html/draft-xu-idr-fare
> Diff:     https://author-tools.ietf.org/iddiff?url2=draft-xu-idr-fare-01
> 
> Abstract:
> 
>    Large language models (LLMs) like ChatGPT have become increasingly
>    popular in recent years due to their impressive performance in
>    various natural language processing tasks.  These models are built by
>    training deep neural networks on massive amounts of text data, often
>    consisting of billions or even trillions of parameters.  However, the
>    training process for these models can be extremely resource-
>    intensive, requiring the deployment of thousands or even tens of
>    thousands of GPUs in a single AI training cluster.  Therefore, three-
>    stage or even five-stage CLOS networks are commonly adopted for AI
>    networks.  The non-blocking nature of the network become increasingly
>    critical for large-scale AI models.  Therefore, adaptive routing is
>    necessary to dynamically load balance traffic to the same destination
>    over multiple ECMP paths, based on network capacity and even
>    congestion information along those paths.
> 
> 
> 
> The IETF Secretariat
> 
> 