Q on the congestion awareness of routing protocols
Toerless Eckert <tte@cs.fau.de> Fri, 02 December 2022 17:03 UTC
Date: Fri, 02 Dec 2022 18:03:05 +0100
From: Toerless Eckert <tte@cs.fau.de>
To: routing-discussion@ietf.org, tsv-area@ietf.org
Cc: pim@ietf.org, bier@ietf.org
Subject: Q on the congestion awareness of routing protocols
Message-ID: <Y4ovyV074qa3gLSu@faui48e.informatik.uni-erlangen.de>
MIME-Version: 1.0
Content-Type: text/plain; charset="us-ascii"
Content-Disposition: inline
Archived-At: <https://mailarchive.ietf.org/arch/msg/tsv-area/HebHF_6dsEKdPFOa62_bMp0LDpg>
Dear routing-discussion / TSV folks

(sorry for escalating this, but it really bugs me - Cc'ing PIM/BIER)

What are these days the expectations for, let's say, a full Internet Standard for a routing protocol in terms of congestion-safe behavior ? And what are the congestion control expectations for new routing protocol RFCs, even if just Proposed Standard ?

I am asking because i think that our core IP multicast routing protocol fails miserably on this end, and quite frankly i do not understand how PIM-SM (RFC7761) could have become a full Internet Standard given that it has zilch discussion about congestion or loss handling.

[ Especially in comparison to a protocol like RFC7450, where TSV did raise concerns about multicast data plane congestion awareness, and it was held up for years, and GregS as the WG-chair for the WG responsible for RFC7450 even had to help co-author RFC8085 to cut through the congestion control concern-cord. But likely all for the better! ]

To quickly summarize the issue with PIM-SM for those who do not know it:

              /- R2 -------- R6 -\
  Rcvrs ... R1                    R7 ... Senders
              \- R3 -- R4 -- R5 -/
  CE   ...  PE  ..  P    P    P   PE     CE ...

R1 has, let's say, 100,000 multicast/PIM (S,G) states with sources behind R7, so it has to maintain 100,000 so-called PIM (S,G) joins across the path R2, R6, R7. Let's say an (S,G) join for IPv6 is roughly 38 bytes, so maybe 35 (S,G) per 1500 byte packet, which makes about 2857 packets of 1500 bytes to carry all 100,000 (S,G).

Assume link R6/R7 fails and the IGP reconverges. R1 recognizes that it needs to change path, so it sends 2857 PIM-SM packets with prunes to R2 and 2857 PIM-SM packets with joins to R3.

Assume R1 is a PE, R2 and R3 are P routers in an SP, and R2/R3 actually connect to, let's say, 100 routers like R1. Now R2 and R3 each get 100 x 2857 packets of 1500 bytes. And there is nothing in the PIM-SM spec that talks about how to throttle this heap of PIM-SM packets. Typically, routers would just send them back-to-back.
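To make the arithmetic of this example easy to replay with other numbers, here is a small back-of-the-envelope sketch in Python. It is only an illustration under the assumptions stated in this mail (35 (S,G) entries per 1500 byte packet, 100 PE routers per P router, a 60 second periodic refresh); the function and parameter names are made up for this sketch and do not come from any PIM implementation. Rounding up gives 2858 packets per PE rather than the roughly 2857 quoted above, which does not change the picture.

```python
def join_prune_burst(states=100_000, sg_per_packet=35, packet_bytes=1500,
                     pe_routers=100, refresh_s=60):
    """Rough size of the PIM-SM join/prune burst arriving at one P router.

    All defaults are the illustrative numbers from the example in this
    mail; they are assumptions, not measured values.
    """
    # Packets one PE needs to carry all of its (S,G) entries (integer ceiling).
    packets_per_pe = -(-states // sg_per_packet)
    # Packets arriving at one P router when every attached PE sends its set.
    packets_at_p = pe_routers * packets_per_pe
    # The same burst expressed in bytes.
    bytes_at_p = packets_at_p * packet_bytes
    # Average bit rate if the full set is re-sent once per refresh interval.
    avg_bits_per_s = bytes_at_p * 8 / refresh_s
    return packets_per_pe, packets_at_p, bytes_at_p, avg_bits_per_s


if __name__ == "__main__":
    per_pe, at_p, nbytes, bps = join_prune_burst()
    print(f"packets per PE:       {per_pe}")
    print(f"packets at P router:  {at_p}")
    print(f"burst size:           {nbytes / 1e6:.1f} MB")
    print(f"average refresh load: {bps / 1e6:.1f} Mbit/s")
```

The average rate is the optimistic view: without pacing, the same bytes arrive as back-to-back bursts rather than spread evenly over the refresh interval.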
And those packets repeat every 60 seconds, given that PIM-SM is datagram-based, periodic soft-state.

In fact, if you try to scale this in production networks, you will most likely break a lot more than just IP multicast in those routers, because PIM will compete badly not only for control-plane CPU time, but even more so for control-plane to hardware-forwarding time when updating the 100,000 (S,G) hardware forwarding entries.

Correct me if i am wrong, but didn't the same type of issue in ISIS/OSPF in DCs, caused by so many parallel paths and hence duplication of LSAs, recently lead to the creation of multiple IETF working groups in RTG to solve it ?

In IP multicast, we were well aware of these issues, and they were a core reason not to build a PIM-based MPLS multicast protocol, but to use the TCP-based LDP to specify mLDP (RFC6388). Same thing when various BGP multicast work was done as an alternative to PIM for SPs (BGP also being TCP based).

We did even fix this problem in PIM by specifying RFC6559 (PIM over TCP). But instead of that mechanism being made mandatory, and the only option for PIM, when PIM moved up the IETF standards ladder to RFC7761, that RFC has seemingly fallen into oblivion in the IP Multicast community, because most IP multicast deployments are small enough that these issues do not occur.

So, why do i escalate this issue now ?

We have a great new multicast architecture called BIER that eliminates all these PIM multicast state issues from the P routers of such large service provider networks by being stateless. But it still leaves the need for overlay signaling, such as PIM, between the PEs; in the above picture, these would be the hundreds if not thousands of receiver PEs R1' and sender PEs R7'. In that case you would run PIM directly between those R1'/R7' across multihop paths, leading to even more congestion considerations.
And in support of such BIER networks, there is a draft, draft-hb-pim-light, proposed to PIM-WG to optimize PIM explicitly for this type of deployment. And when i said in PIM@IETF115 that such a draft IMHO should only be allowed to proceed when it is written to say it MUST be based on PIM over TCP (RFC6559), all other people responding on the thread said at best it could be a MAY. Aka: congestion control optional.

Am i a congestion control extremist ? I really only want to have scalable, reliable multicast RFCs, especially when they aspire to and reach full IETF standard and are meant to support our next-gen IP Multicast architectures (BIER).

I do fully understand that there is a lot of cost pressure on vendor development, and that having procrastinated on implementing, proliferating and deploying PIM over TCP so far (almost a decade!) makes this a less attractive choice in the short term. And the whole purpose of the PIM light draft of course is to reduce the amount of development needed by making PIM more "light" (which is a good thing). But when it carries forward the problems of PIM to another generation of networks (using BIER) that was built specifically to scale better, then one should IMHO really become worried. At least i do. But i also struggled to implement datagram PIM processing for 100,000 states in a prior life and then pushed for PIM over TCP...

Thanks!
    Toerless
- Q on the congestion awareness of routing protocols Toerless Eckert
- Re: [pim] Q on the congestion awareness of routin… Greg Shepherd
- Re: Q on the congestion awareness of routing prot… Jon Crowcroft
- Re: Q on the congestion awareness of routing prot… Matt Mathis
- Re: [pim] Q on the congestion awareness of routin… Toerless Eckert
- Re: Q on the congestion awareness of routing prot… Toerless Eckert
- Re: Q on the congestion awareness of routing prot… Toerless Eckert
- Re: Q on the congestion awareness of routing prot… Masataka Ohta
- Re: Q on the congestion awareness of routing prot… Jon Crowcroft
- Re: Q on the congestion awareness of routing prot… Stewart Bryant
- Re: Q on the congestion awareness of routing prot… Masataka Ohta
- Re: Q on the congestion awareness of routing prot… Jon Crowcroft
- Re: Q on the congestion awareness of routing prot… Masataka Ohta
- Re: Q on the congestion awareness of routing prot… Jon Crowcroft
- Re: Q on the congestion awareness of routing prot… Bless, Roland (TM)
- Re: Q on the congestion awareness of routing prot… Carsten Bormann
- Re: Q on the congestion awareness of routing prot… Robert Raszuk
- Re: Q on the congestion awareness of routing prot… Curtis Villamizar
- Re: Q on the congestion awareness of routing prot… Masataka Ohta
- Re: Q on the congestion awareness of routing prot… Bless, Roland (TM)
- Re: [pim] Q on the congestion awareness of routin… Dino Farinacci
- Re: [pim] Q on the congestion awareness of routin… Dino Farinacci
- Re: Q on the congestion awareness of routing prot… Toerless Eckert
- Re: Q on the congestion awareness of routing prot… Matt Mathis
- RE: [Bier] [pim] Q on the congestion awareness of… Dirk Trossen
- Re: [pim] Q on the congestion awareness of routin… Jon Crowcroft
- Re: [pim] Q on the congestion awareness of routin… Masataka Ohta
- Re: [pim] Q on the congestion awareness of routin… Michael Welzl
- Re: [pim] Q on the congestion awareness of routin… Jonathan Morton
- Re: [Bier] [pim] Q on the congestion awareness of… Robert Raszuk
- OT and trimmed (was Re: Q on the congestion aware… Curtis Villamizar
- Re: [Bier] [pim] Q on the congestion awareness of… Luigi Iannone
- Re: [pim] Q on the congestion awareness of routin… Dino Farinacci
- Re: [pim] Q on the congestion awareness of routin… Masataka Ohta
- Re: [Bier] [pim] Q on the congestion awareness of… Masataka Ohta
- Re: [Bier] [pim] Q on the congestion awareness of… Robert Raszuk
- Re: [Bier] [pim] Q on the congestion awareness of… Masataka Ohta
- RRG thoughts (was [Bier] [pim] Q on the congestio… Toerless Eckert
- Re: Q on the congestion awareness of routing prot… Toerless Eckert
- Re: RRG thoughts (was [Bier] [pim] Q on the conge… Luigi Iannone
- Re: [pim] Q on the congestion awareness of routin… Toerless Eckert
- Re: RRG thoughts (was [Bier] [pim] Q on the conge… Abdussalam Baryun
- RE: [pim] Q on the congestion awareness of routin… Jeffrey (Zhaohui) Zhang
- Re: [pim] Q on the congestion awareness of routin… Jonathan Morton
- Re: [pim] Q on the congestion awareness of routin… Jon Crowcroft
- Re: [pim] Q on the congestion awareness of routin… Michael Richardson
- Re: [pim] Q on the congestion awareness of routin… Jon Crowcroft
- Re: [pim] Q on the congestion awareness of routin… Adamson, Robert B CIV USN NRL (5592) Washington DC (USA)
- Re: [Bier] [pim] Q on the congestion awareness of… Toerless Eckert
- RE: [Bier] [pim] Q on the congestion awareness of… Jeffrey (Zhaohui) Zhang