Re: [Bier] PIM/BIER: Where and how to do structure-encoded multicast distribution tree forwarding work ?

Toerless Eckert <tte@cs.fau.de> Thu, 03 August 2023 21:52 UTC

Date: Thu, 03 Aug 2023 23:51:55 +0200
From: Toerless Eckert <tte@cs.fau.de>
To: pim@ietf.org, bier@ietf.org, msr6@ietf.org
Message-ID: <ZMwhewRUm12vP6Xb@faui48e.informatik.uni-erlangen.de>
Archived-At: <https://mailarchive.ietf.org/arch/msg/bier/Y-dUu-fenwKPzfNov615pkQeCWk>

Dear PIM-WG, BIER-WG and msr6 community:

I am writing this in response to <CA+wi2hNMmeTefXVYJxmCT8mRgmp1EfZr+zRsWbadb5Bw37rnpA@mail.gmail.com>
on BIER WG from TonyP, which i think is jumping the gun on something i wish we had more time to discuss,
both technically and in terms of process. Alas, that discussion could not happen at IETF117 for reasons
explained further below.

Ultimately, IMHO, the question is how best to do what i think should be a single next-gen
stateless multicast forwarding solution, given the structure of our existing working groups PIM/BIER and
how that structure may require us to split the work down the middle.

Let me try to explain the core technical issue:

We have been working since early 2023 on P4/Tofino validation code for our concepts
of how to do stateless IP Multicast forwarding in a scalable, high-speed, low-cost fashion
applicable to, hopefully, a broad set of vendor hardware (hence validation on Tofino: as the lowest
common denominator it may not even do everything we want, but it is useful to gauge feasibility
on better hardware). Our initial proposal, presented for IETF discussion since the end of 2021,
evolved from draft-xu-msr6-rbs to draft-eckert-bier-rbs, with further variations
in draft-eckert-msr6-rbs.

Note that there is also a vendor NPU prototype of the original version described in draft-xu-msr6,
which was validated with a high-speed NPU on a research WAN in China. We recently had the research report
translated; see here:

https://github.com/network2030/publications/blob/main/CENI_Carrier_Grade_Minimalist_Multicast_Networking_Test_Report.pdf

In short, the forwarding method is to perform scalable, high-speed, source-routed, stateless multicast
by encoding the multicast tree in a source routing header as an easy-to-parse,
compact structure.

This is of course next-gen work based on experience with BIER and especially BIER-TE (RFC9262),
both of which use what i call "network wide flat bitstrings", and on experience with their scaling
and implementation costs/challenges.
In these "RBS" encoding structures, as we call them, per-tree-node bitstrings are used to indicate the next hops in the tree,
so, for example, each router only needs to parse a single (short) bitstring for its next hops and can ignore the rest of the
encoded structure.
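
To make the per-tree-node idea concrete, here is a minimal Python sketch. It is not
the encoding from draft-eckert-bier-rbs or any other draft: the TreeNode class, the
forward() function and the interface numbering are all hypothetical, chosen only to
show a router looking at its own local bitstring and passing each sub-tree on unparsed.

```python
from dataclasses import dataclass, field

@dataclass
class TreeNode:
    """One node of a hypothetical source-routed multicast tree (illustrative only)."""
    bitstring: int                                  # bit i set => send a copy out of local interface i
    children: dict = field(default_factory=dict)    # interface index -> TreeNode for that next hop

def forward(node):
    """Return the (interface, sub-tree) pairs one router would emit.

    Only node.bitstring is inspected. The sub-trees are handed on as opaque
    values, mirroring the idea that a router ignores the rest of the structure.
    """
    copies = []
    bits, index = node.bitstring, 0
    while bits:
        if bits & 1:
            copies.append((index, node.children.get(index)))
        bits >>= 1
        index += 1
    return copies

# A three-level toy tree: root -> branch -> two leaves.
leaf = TreeNode(bitstring=0)
branch = TreeNode(bitstring=0b101, children={0: leaf, 2: leaf})
root = TreeNode(bitstring=0b010, children={1: branch})

for interface, subtree in forward(root):
    print("root replicates to interface", interface)
```

In this toy model the cost of a lookup depends on the number of local interfaces of
one router, not on the size of the network, which is the contrast with a network wide
flat bitstring.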

However, we have also started to work on scalability simulations and comparisons
of different approaches. One not yet finalized but likely conclusion is that a solution
should offer the option to address next-hops via either a bitstring or
a sequence of next-hop identifiers. Take for example the two RFC9262 Section 2.2 examples:
the first might be best encoded with next-hop bitstrings, and the second
with next-hop identifiers. In general: the sparser the tree, the more likely per-next-hop
identifiers are better; the denser the tree in a particular area of the network, the
more likely bitstrings as next-hops are better.
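
The sparse-versus-dense trade-off can be illustrated with a toy cost model in Python.
The model is an assumption made for illustration, not a result of the simulations
mentioned above: a per-node bitstring is taken to cost one bit per adjacency of the
router, and an identifier list is taken to cost ceil(log2(degree)) bits per next hop.
Length fields, headers and alignment are ignored, and all function names are hypothetical.

```python
def bitstring_bits(degree):
    """Assumed cost of a per-node bitstring: one bit per adjacency."""
    return degree

def identifier_bits(degree, next_hops):
    """Assumed cost of an identifier list: one fixed-width id per next hop."""
    id_width = max(1, (degree - 1).bit_length())   # ceil(log2(degree)), at least 1 bit
    return next_hops * id_width

def cheaper_encoding(degree, next_hops):
    """Pick the smaller encoding for one tree node under the toy model."""
    if identifier_bits(degree, next_hops) < bitstring_bits(degree):
        return "identifiers"
    return "bitstring"

# Sparse: a router with 64 adjacencies replicating to only 2 of them.
print(cheaper_encoding(degree=64, next_hops=2))
# Dense: the same router replicating to 40 of its adjacencies.
print(cheaper_encoding(degree=64, next_hops=40))
```

Under these assumptions the sparse case costs 12 bits as identifiers against 64 bits
as a bitstring, while the dense case costs 240 bits as identifiers against the same
64 bits, which matches the general rule stated above.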

When i talked about the general problem, TonyP said at the mic at BIER-WG IETF116
that a solution using bitstrings could go into BIER, whereas a solution with identifiers would need to
go to a different WG, such as SPRING (where PIM WG seems to be doing multicast work with the
SPRING architecture).

Hence we have the potential problem, given our current IETF WGs, that we may
not be able to work in a single WG on a solution with two options, even if we technically agree
that we need both and that they are just variations within an otherwise common encoding mechanism.

I would find such an outcome technically quite difficult and wish we could
make the scope of the solution tackled by a single working group (current or future) broad
enough that all useful technical options are on the table for it.

We did plan to present and discuss this at IETF117 in PIM-WG because of exactly this
issue, and i had also sent an email about it before IETF117 to the PIM WG and the
PIM chairs, and in private email to the BIER WG chairs.

Unfortunately, the PIM WG chairs overlooked the slot ask, and when this was discovered, the
PIM WG agenda for IETF117 was already full. Likewise, the BIER chairs did not reply to my email at all,
and i had a slot conflict so i could not go to the BIER-WG meeting this time. Note also that several
other active participants on the topic could not come, because of the cost of getting to the USA and/or
because the USA failed to issue them visas.

So, i think the list here should be used for this discussion,
and/or the IETF meeting in Prague will, we hope, be a much better place for a more complete showing of the
community and an in-person discussion.

Cheers
   Toerless

P.S.: There are already other proposals with a single tree encoding supporting both
next-hop bitstring and next-hop identifier encoding of tree structures, such as
draft-chen-pim-adaptive-te, which is AFAIK a variation of what was earlier presented at
the IETF114 msr6 BOF, but i am not aware whether this line of proposals has had
vendor and whitebox NPU/PoC validation, or simulation-based comparison of scaling aspects,
similar to what we are doing for our proposals in order to feel solid about our conclusions.