[Idr] Re: Fwd: I-D Action: draft-wang-idr-dpf-00.txt
Robert Raszuk <robert@raszuk.net> Thu, 04 December 2025 22:17 UTC
To: "Wang, Kevin" <kevin.wang@hpe.com>
CC: "idr@ietf. org" <idr@ietf.org>, lsr <lsr@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/diVwKzL2U6dHAAfcuUyRozTqeJI>
Hi Kevin,

I am sceptical whether the proposed BGP extension is desirable in the BGP protocol. But this is just my own opinion and, as Jeff says, I can be in the "rough" on it.

But reading your proposal, I do think that marking this coloring on a per BGP session basis (strict or loose) is a very bad idea. We departed from any per-session marking when the MP-BGP extensions were introduced.

So if you want to continue, I recommend a much more granular capability of coloring, ideally on a per NLRI/UPDATE MSG basis. With your current proposal you have created a physical partitioning, not a logical one.

Also, can you elaborate in your draft (keeping in mind BGP native recursiveness) why the BGP CAR or BGP CT proposals fail to address your objectives? Are they broken and in need of fixing, or do you just prefer to start fresh with yet one more way to achieve the same?

Thank you,
R.

On Thu, Dec 4, 2025 at 10:18 PM Wang, Kevin <kevin.wang@hpe.com> wrote:

> Hi Robert,
>
> Unless we have perfect load balancing, congestion is always possible,
> even in a non-blocking Clos fabric. Also, there are other scenarios
> where avoiding fate-sharing paths is crucial.
>
> Thanks,
> Kevin
>
> From: Robert Raszuk <robert@raszuk.net>
> Date: Wednesday, December 3, 2025 at 3:59 PM
> To: Wang, Kevin <kevin.wang@hpe.com>
> Cc: Gyan Mishra <hayabusagsm@gmail.com>, idr@ietf. org <idr@ietf.org>,
> lsr <lsr@ietf.org>
> Subject: Re: [Idr] Re: Fwd: I-D Action: draft-wang-idr-dpf-00.txt
>
> Hi Kevin,
>
> Your draft explains how to do poor man's flex algo in BGP - ok.
>
> But could you elaborate why anyone would do that (and push more
> complexity) in a non-blocking CLOS fabric?
>
> Cheers,
> R.
>
> On Wed, Dec 3, 2025 at 7:35 PM Wang, Kevin <kevin.wang@hpe.com> wrote:
>
> Hi Robert,
>
> Thank you for providing further details about your thoughts. What I
> heard is that IGP was not initially adopted in DC fabrics because of
> its scaling issues (mostly due to LSDB flooding), especially for the
> hyperscalers. I understand that there were later efforts to address
> the scaling issues from the IGP side. I see your experience of using
> ISIS to successfully construct the fabric as a good example. Yes, it
> might be worth writing an "ISIS for DC fabrics" informational RFC,
> serving as an alternative to RFC 7938. There are also other efforts
> trying to bring traffic engineering technologies, such as RSVP, MPTE,
> etc., to the DC fabrics. Like any other networks, the DC fabrics will
> probably also evolve over time.
>
> Having said that, most of today's DC fabrics (at least for those DC
> customers I have dealt with) are designed following RFC 7938:
>
> - Use Clos topology
> - Use IP forwarding
> - Use EBGP as the underlay routing protocol
>
> I guess the choices above are for technical reasons as well as
> business reasons. BGP DPF is developed under the
> assumptions/observations above. I agree that the DC fabrics might
> evolve and adopt other technologies such as IGP or RSVP in the future.
> For the time being and the foreseeable future, BGP DPF would help to
> provide lightweight traffic engineering for the DC fabrics.
>
> Thanks,
> Kevin
>
> From: Robert Raszuk <robert@raszuk.net>
> Date: Tuesday, December 2, 2025 at 2:46 PM
> To: Wang, Kevin <kevin.wang@hpe.com>
> Cc: Gyan Mishra <hayabusagsm@gmail.com>, idr@ietf. org <idr@ietf.org>,
> lsr <lsr@ietf.org>
> Subject: Re: [Idr] Re: Fwd: I-D Action: draft-wang-idr-dpf-00.txt
>
> Dear Kevin,
>
> I know very well what RFC 7938 says. In fact I did review this
> document well before it became an RFC :)
>
> But what happened next is that while RFC 7938 makes a valid
> observation on how one can build MSDCs, lots of folks misinterpreted
> it as the only guide on how to build even a few racks of DC fabrics.
>
> So yes, using BGP to construct dynamic routing in the DC fabrics has
> its use cases, but they are really applicable to only a handful of
> deployments. And I am not aware that any of the MSDCs would be asking
> you for logical transport planes within their fabrics.
>
> All other DCs would be much better off using IGP for underlay and BGP
> for overlay as a design pattern.
>
> When I constructed 10 full racks of hardware using ISIS, folks were
> shocked - and pointed out that I am not using an IETF standard
> approach :). Then when I demonstrated that connectivity upon any node
> or link failure is restored in less than 50 ms, the masks went off.
>
> Maybe what is actually needed is an informational RFC - just like
> RFC 7938 - simply illustrating that one can construct a DC using ISIS.
> It is obvious to me, but I admit there is no RFC I am aware of to show
> operators that "Large-Scale Data Centers" can be robustly built with
> IGPs.
>
> Kind regards,
> Robert
>
> On Tue, Dec 2, 2025 at 7:24 PM Wang, Kevin <kevin.wang@hpe.com> wrote:
>
> Hi Robert and Gyan,
>
> Thanks for your feedback! Your observation is correct that IGP Flex
> Algo could achieve the same. BGP DPF can be thought of as a BGP
> counterpart of IGP Flex Algo to some extent (though not precisely).
>
> As explained in the "Introduction" section of this draft, BGP DPF is
> designed for the current IP fabric environment where EBGP is usually
> the only protocol used for routing. Section 5 of RFC 7938 explains why
> DC fabrics use EBGP as the sole routing protocol.
>
> Thanks,
> Kevin
>
> From: Gyan Mishra <hayabusagsm@gmail.com>
> Date: Tuesday, December 2, 2025 at 7:43 AM
> To: Robert Raszuk <robert@raszuk.net>
> Cc: idr@ietf. org <idr@ietf.org>, lsr <lsr@ietf.org>
> Subject: [Idr] Re: Fwd: I-D Action: draft-wang-idr-dpf-00.txt
>
> I agree with Robert that you could use RFC 9502 IGP Flex Algo in IP
> networks to build disjoint planes as desired.
>
> You could also use SRv6 with IGP Flex Algo with SR RFC 9350 which uses
> IPv6 data plane and build your disjoint planes.
>
> Thanks
>
> Gyan
>
> On Tue, Dec 2, 2025 at 6:32 AM Robert Raszuk <robert@raszuk.net> wrote:
>
> Hi,
>
> In respect to the subject draft ... why would you not use IGP Flexible
> Algorithm for it?
>
> Are you going to port now years of work from IGP to BGP to achieve the
> same?
>
> Besides, in a non-blocking fabric latency is really not a factor. So
> you want to logically partition it to make it blocking, then worry
> about what travels on which such logical plane? Is this a reasonable
> direction?
>
> Thx,
> R.
>
> ---------- Forwarded message ---------
> From: <internet-drafts@ietf.org>
> Date: Mon, Dec 1, 2025 at 10:49 PM
> Subject: I-D Action: draft-wang-idr-dpf-00.txt
> To: <i-d-announce@ietf.org>
>
> Internet-Draft draft-wang-idr-dpf-00.txt is now available.
>
>    Title:   BGP Deterministic Path Forwarding (DPF)
>    Authors: Kevin Wang
>             Michal Styszynski
>             Wen Lin
>             Mahesh Subramaniam
>             Thomas Kampa
>             Diptanshu Singh
>    Name:    draft-wang-idr-dpf-00.txt
>    Pages:   18
>    Dates:   2025-12-01
>
> Abstract:
>
>    Modern data center (DC) fabrics typically employ Clos topologies
>    with External BGP (EBGP) for plain IPv4/IPv6 routing. While
>    hop-by-hop EBGP routing is simple and scalable, it provides only a
>    single best-effort forwarding service for all types of traffic.
>    This single best-effort service might be insufficient for
>    increasingly diverse traffic requirements in modern DC
>    environments. For example, loss and latency sensitive AI/ML flows
>    may demand stronger Service Level Agreements (SLA) than general
>    purpose traffic. Duplication schemes which are standardized through
>    protocols such as Parallel Redundancy Protocol (PRP) require
>    disjoint forwarding paths to avoid single points of failure.
>    Congestion avoidance may require more deterministic forwarding
>    behavior.
>
>    This document introduces BGP Deterministic Path Forwarding (DPF), a
>    mechanism that partitions the physical fabric into multiple logical
>    fabrics. Flows can be mapped to different logical fabrics based on
>    their specific requirements, enabling deterministic forwarding
>    behavior within the data center.
>
> The IETF datatracker status page for this Internet-Draft is:
> https://datatracker.ietf.org/doc/draft-wang-idr-dpf/
>
> There is also an HTML version available at:
> https://www.ietf.org/archive/id/draft-wang-idr-dpf-00.html
>
> Internet-Drafts are also available by rsync at:
> rsync.ietf.org::internet-drafts
>
> _______________________________________________
> I-D-Announce mailing list -- i-d-announce@ietf.org
> To unsubscribe send an email to i-d-announce-leave@ietf.org
> _______________________________________________
> Idr mailing list -- idr@ietf.org
> To unsubscribe send an email to idr-leave@ietf.org