Re: [pim] pim wg adoption call for draft-chen-pim-srv6-p2mp-path-01

Stig Venaas <stig@venaas.com> Thu, 28 January 2021 20:19 UTC

From: Stig Venaas <stig@venaas.com>
Date: Thu, 28 Jan 2021 12:19:27 -0800
Message-ID: <CAHANBtLPdseN7mU04Cy7sp-th84Lgz5HrYFFFzRh1w9_0BbLCQ@mail.gmail.com>
To: draft-chen-pim-srv6-p2mp-path@ietf.org, pim@ietf.org
Content-Type: multipart/alternative; boundary="0000000000006701b505b9fb9b0d"
Archived-At: <https://mailarchive.ietf.org/arch/msg/pim/Q2nUc9PdzftQi92C7VbF65IbjC8>
Subject: Re: [pim] pim wg adoption call for draft-chen-pim-srv6-p2mp-path-01

Thanks for the great discussion; please keep it going. It is refreshing to have
technical discussions on the list.

The adoption deadline has ended, and at least for now, there is not a rough
consensus for adoption.

Please continue the discussion on the pim wg list, and it may be worth
discussing in spring as well. We will reconsider adoption if, based on
further discussion and draft revisions, a rough consensus appears to be
emerging. The key thing is to consider why several people are currently
opposed to adoption, and see if any of the issues raised can be addressed.

It is also possible that some of the work should be done in spring, or at a
minimum that we get input from spring, so authors, please try to get input
on the draft there.

Regards,
Stig




On Thu, Jan 28, 2021 at 7:45 AM Acee Lindem (acee) <acee=
40cisco.com@dmarc.ietf.org> wrote:

> Hi Gyan,
>
>
>
> *From: *pim <pim-bounces@ietf.org> on behalf of Gyan Mishra <
> hayabusagsm@gmail.com>
> *Date: *Thursday, January 28, 2021 at 8:48 AM
> *To: *"Jeffrey (Zhaohui) Zhang" <zzhang=40juniper.net@dmarc.ietf.org>
> *Cc: *Toerless Eckert <tte@cs.fau.de>, "pim@ietf.org" <pim@ietf.org>
> *Subject: *Re: [pim] pim wg adoption call for
> draft-chen-pim-srv6-p2mp-path-01
>
>
>
> All,
>
>
>
> Huaimo has provided some valuable responses on the scaling issues. One point
> to note on scaling is that almost all providers run jumbo frames with a
> 9216-byte MTU, and the IPv4 Total Length field is 16 bits, which supports
> packets of up to 65,535 bytes.
>
>
>
> The IPv6 header natively carries a 16-bit Payload Length field, the same
> width as the IPv4 Total Length field.
>
>
>
> RFC 2675 specifies a hop-by-hop Jumbo Payload option that carries a 32-bit
> length field, allowing payloads between 65,536 and 4,294,967,295 octets in
> length. Packets with such long payloads are referred to as "jumbograms".
>
>
>
> All router and switch vendors today support super jumbo frames, that is,
> payload lengths between 9000 and 65535 bytes, capped today at 9216.
>
>
>
> In the future, as memory and buffer sizes increase in fixed-function
> hardware (ASIC, FPGA) or programmable NPUs for packet I/O processing, the
> sizes will eventually jump to 65535. The next big step will be jumbograms. 😀
>
>
>
> With SRv6, unlike SR-MPLS, traditional MPLS-based PTAs such as RSVP-TE or
> mLDP cannot be leveraged, so the options for a P2MP MDT are very limited. If
> you don’t want the source-node overhead of IR or the SR replication SID, you
> have BIER as your only option.
>
>
>
> Operators love having options, and this draft provides a valuable option
> that does not require expensive upgrades to support BIER BFRs.
>
>
>
> But you will require an even more expensive upgrade to support the SRv6
> network function to process the MDT subtree in the segment list.
>
>
>
> Acee
>
>
>
> I support WG adoption as stated, as this draft is very promising for
> operators as an SRv6 “BIER alternative”.
>
>
>
> Thanks
>
>
>
> Gyan
>
>
>
> On Wed, Jan 27, 2021 at 9:13 PM Jeffrey (Zhaohui) Zhang <zzhang=
> 40juniper.net@dmarc.ietf.org> wrote:
>
> Hi Huaimo,
>
>
>
> “For a network with 4k nodes”, how large could a multicast tree be, and
> with draft-chen, how do you encode that tree in packets? How many copies
> would you send if each copy is only for a sub-tree?
>
>
>
> A large network can be divided into BIER sub-domains, each with a smaller
> bitstring. If you use tunnel segmentation then you can have efficient
> replication all the way through (though the segmentation points will have
> per-tree state). If you don’t want to use segmentation, the ingress can
> tunnel one copy to each sub-domain and inside each sub-domain you can have
> efficient replication.
>
>
>
> Jeffrey
>
>
>
> *From:* Huaimo Chen <huaimo.chen@futurewei.com>
> *Sent:* Tuesday, January 26, 2021 10:39 PM
> *To:* Jeffrey (Zhaohui) Zhang <zzhang@juniper.net>; Jeffrey (Zhaohui)
> Zhang <zzhang=40juniper.net@dmarc.ietf.org>; Bidgoli, Hooman (Nokia -
> CA/Ottawa) <hooman.bidgoli@nokia.com>; Stig Venaas <stig@venaas.com>;
> pim@ietf.org; 'Toerless Eckert' <tte@cs.fau.de>
> *Subject:* Re: [pim] pim wg adoption call for
> draft-chen-pim-srv6-p2mp-path-01
>
>
>
> *[External Email. Be cautious of content]*
>
>
>
> Hi Jeffrey,
>
>
>
>     Thanks for your further comments.
>
>     My responses are inline below with prefix [HC4].
>
>
>
> Best Regards,
>
> Huaimo
> ------------------------------
>
> *From:* Jeffrey (Zhaohui) Zhang <zzhang@juniper.net>
> *Sent:* Monday, January 25, 2021 10:08 PM
> *To:* Huaimo Chen <huaimo.chen@futurewei.com>; Jeffrey (Zhaohui) Zhang <
> zzhang=40juniper.net@dmarc.ietf.org>; Bidgoli, Hooman (Nokia - CA/Ottawa)
> <hooman.bidgoli@nokia.com>; Stig Venaas <stig@venaas.com>; pim@ietf.org <
> pim@ietf.org>; 'Toerless Eckert' <tte@cs.fau.de>
> *Subject:* RE: [pim] pim wg adoption call for
> draft-chen-pim-srv6-p2mp-path-01
>
>
>
> Hi Huaimo,
>
>
>
> Let me pull two points to the top of this email.
>
>
>
>
>
> [HC3]: This is for the case where the overhead in a BIER-TE packet is 10k
> bits
>
> when one BitString without any set is used. For the same overhead, how
> many links
>
> can be encoded.
>
> Zzh3> BIER-TE won’t use 10k bits in the packet. It will use a finite
> number of bits (e.g. 512 bits or 1024 bits) with multiple sets, just like
> you break the entire tree into different sub-trees. The comparison is, say
> with 1024 bits as encoding space, how many nodes/links can each solution
> encode. I don’t see how draft-chen can do better.
>
> [HC4]: For a network with 4k nodes, there can be about 32k links and 64k
> BitPositions.
>
> When we use 1024 bits with multiple sets in the packet header,
>
> the number of sets and the size of the set need to represent 64k
> BitPositions,
>
> but 1024 bits contains a sub-tree that can be encoded in 1024 bits.
>
> If the size of the set is 64 bits, the number of sets needs to be 10 bits.
>
> Each set is encoded using 74 (64 bits for BitString + 10 bits for SI) bits.
>
> 1024 bits can contain 1024/74 = 13 sets.
>
> For a sub-tree traversing 13 links with BitPositions in 13 sets,
>
> 1024 bits encodes 13 links of this sub-tree in BIER-TE.
>
> With a good SID compression, the size of compressed SID can be 21 bits
>
> (12 bits for node IDs, 3 bits for branches, 6 bits for number of SIDs),
>
> 1024 bits can encode 1024/21 = 48 links/nodes of a sub-tree in draft-chen.
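The [HC4] arithmetic above can be reproduced with a short sketch. It takes the figures from the text as given (a 1024-bit encoding budget, 64k BitPositions, 64-bit BitStrings, a 21-bit compressed SID) and, like [HC4], assumes the worst case for BIER-TE in which every link of the sub-tree falls in a different set. The variable names are illustrative only and do not come from either draft.

```python
HEADER_BITS = 1024            # encoding budget per packet, from the thread
BIT_POSITIONS = 64 * 1024     # "64k BitPositions" for a 4k-node network
BITSTRING_BITS = 64           # example set size used in [HC4]

# Sets needed to cover every BitPosition, and bits needed to index a set (SI).
num_sets = BIT_POSITIONS // BITSTRING_BITS
si_bits = (num_sets - 1).bit_length()

# BIER-TE, worst case assumed in [HC4]: one (SI, BitString) pair per link.
bier_te_bits_per_set = BITSTRING_BITS + si_bits
bier_te_links = HEADER_BITS // bier_te_bits_per_set

# draft-chen with the compressed SID layout quoted in [HC4]:
# 12 bits node ID + 3 bits branches + 6 bits number of SIDs.
compressed_sid_bits = 12 + 3 + 6
draft_chen_links = HEADER_BITS // compressed_sid_bits

print(f"SI bits: {si_bits}, bits per set: {bier_te_bits_per_set}")
print(f"BIER-TE links in {HEADER_BITS} bits (worst case): {bier_te_links}")
print(f"draft-chen links/nodes in {HEADER_BITS} bits: {draft_chen_links}")
```

The comparison depends heavily on the worst-case assumption: if several links of a sub-tree share one set, a single 74-bit entry covers all of them, which is the point raised under Zzh3.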
>
>
>
> [HC3]: It should be a new forwarding scheme for SRv6 replication segment.
>
> Under the network programming, replication SID is a new type of SID.
>
> It will have a new program/procedure for the behavior of this new SID
>
> even though it is “like” existing MPLS P2MP.
>
> Zzh3> The general SRv6 network programming idea is that an IPv6 address is
> broken into the locator part and func/arg part. The locator part gets
> traffic to a node and the func/arg part is looked up to decide what to do
> next. SRv6-P2MP follows that perfectly, so **nothing is new**. When I
> said it is “like” existing MPLS P2MP, I should also have said that it uses
> **existing** SRv6 mechanisms.
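To make the locator / func / arg split concrete, here is a minimal sketch that slices a 128-bit SID into those three parts. The field widths (64-bit locator, 16-bit function, 48-bit argument), the function name `split_sid`, and the example address from the 2001:db8::/32 documentation prefix are all assumptions chosen for illustration; real deployments choose their own lengths.

```python
import ipaddress

SID_BITS = 128


def split_sid(sid: str, loc_bits: int = 64, func_bits: int = 16):
    """Split an SRv6 SID into (locator, function, argument) integers.

    loc_bits and func_bits are hypothetical defaults; the argument
    takes whatever bits remain.
    """
    arg_bits = SID_BITS - loc_bits - func_bits
    if arg_bits < 0:
        raise ValueError("locator and function do not fit in 128 bits")
    value = int(ipaddress.IPv6Address(sid))
    locator = value >> (SID_BITS - loc_bits)
    function = (value >> arg_bits) & ((1 << func_bits) - 1)
    argument = value & ((1 << arg_bits) - 1)
    return locator, function, argument


locator, function, argument = split_sid("2001:db8:0:1:42::7")
print(hex(locator), hex(function), hex(argument))
```

A node would route on the locator bits and then use the function value as a lookup key, much as a label is looked up in MPLS.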
>
> Zzh3> Jeffrey
>
>
>
> *From:* Huaimo Chen <huaimo.chen@futurewei.com>
> *Sent:* Monday, January 25, 2021 11:45 AM
> *To:* Jeffrey (Zhaohui) Zhang <zzhang@juniper.net>; Jeffrey (Zhaohui)
> Zhang <zzhang=40juniper.net@dmarc.ietf.org>; Bidgoli, Hooman (Nokia -
> CA/Ottawa) <hooman.bidgoli@nokia.com>; Stig Venaas <stig@venaas.com>;
> pim@ietf.org; 'Toerless Eckert' <tte@cs.fau.de>
> *Subject:* Re: [pim] pim wg adoption call for
> draft-chen-pim-srv6-p2mp-path-01
>
>
>
>
> Hi Jeffrey,
>
>
>
>     Thanks for your further comments.
>
>     My responses are inline below with prefix [HC3].
>
>
>
> Best Regards,
>
> Huaimo
>
>
> ------------------------------
>
> *From:* Jeffrey (Zhaohui) Zhang <zzhang@juniper.net>
> *Sent:* Sunday, January 24, 2021 11:26 PM
> *To:* Huaimo Chen <huaimo.chen@futurewei.com>; Jeffrey (Zhaohui) Zhang <
> zzhang=40juniper.net@dmarc.ietf.org>; Bidgoli, Hooman (Nokia - CA/Ottawa)
> <hooman.bidgoli@nokia.com>; Stig Venaas <stig@venaas.com>; pim@ietf.org <
> pim@ietf.org>; 'Toerless Eckert' <tte@cs.fau.de>
> *Subject:* RE: [pim] pim wg adoption call for
> draft-chen-pim-srv6-p2mp-path-01
>
>
>
> Hi Huaimo,
>
>
>
> Please see zzh2> below. I trimmed some text.
>
>
>
>
>
> Zzh> Plus the uSID-Block-ID, for each SRv6 SID that is needed? More below.
>
> [HC2]: The size of compressed SID (u bits) is the average size of
>
> a compressed SID. If compressing method uses uSID-Block-ID, it is
> considered
>
> in the average size. There are a few methods for compressing SIDs.
>
> They may have different compression rates.
>
>
>
> Zzh2> Before WG adoption, the draft needs to have details spelled out with
> one of the compression methods. I don’t think it scales even w/
> compression, but it’s a non-starter at all w/o compression.
>
> [HC3]: The details about one compression method were posted in one of my
>
> previous responses.
>
> I believe that it is scalable with compression. The details in the example
>
> of one compression method illustrated, on average, one link on a multicast
>
> tree uses 32 bits. This can be improved to 20 bits per link.
>
> It can be a starter without compression in the following way.
>
> For a P2MP/multicast path/tree, the tree can be "split" into multiple
> sub-trees
>
> such that each of the sub-trees can be encoded in the segment list of the
>
> finite size.
>
>
>
> Suppose that the size of SI and BitString are s and b bits respectively,
> and
>
> the maximum number of possible BitPositions is M bits, the overhead of
> BIER-TE
>
> is M bits if a maximum BitString is used,
>
> (in this case, M may be greater than N*u. For example, when M = 10k, u = 40,
>
> M > N*u for N < 10k/40)
>
> up to N*(s+b) if bit sets are used.
>
> (in this case, N*(s+b) may be greater than N*u. For example, when b = 64, s =
> 10,
>
> u = 40, N*(s+b) > N*u )
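The two overhead comparisons above can be written out as a small sketch. It uses the symbols from the text (M, N, u, s, b) with the example values quoted there, reads M = 10k as 10,000, and keeps the text's assumption that each of the N links needs its own (SI, BitString) pair. The function names are hypothetical.

```python
def bier_te_sets_overhead(n_links: int, s_bits: int, b_bits: int) -> int:
    """Bits used when each link needs its own (SI, BitString) pair."""
    return n_links * (s_bits + b_bits)


def segment_list_overhead(n_links: int, u_bits: int) -> int:
    """Bits used by N compressed SIDs of u bits each."""
    return n_links * u_bits


M = 10_000   # single maximum-length BitString, example value from the text
u = 40       # average compressed SID size in bits
s, b = 10, 64

# A single M-bit BitString costs more than the segment list while N < M/u.
break_even_links = M // u
print(f"M > N*u for N < {break_even_links}")

# With bit sets, the per-link cost is s+b bits against u bits.
for n in (1, 13, 100):
    print(n, bier_te_sets_overhead(n, s, b), segment_list_overhead(n, u))
```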
>
>
>
> zzh> Do you mean “M = 10,000”?
>
> [HC2]: This is an example value for M. For a network with 10k links, there
> are 20k bitpositions.
>
> On average, the maximum bitposition is around 10k.
>
>
>
> zzh> The overhead consideration should be on a per-packet basis. How much
> of a sub-tree can you encode in **one** packet with a header of a certain
> size? Your calculation does not seem to be for that (and I have trouble
> following it).
>
> [HC2]: With overhead M bits in a packet, M/u links can be encoded.
>
>
>
> Zzh2> I assume you won’t have 10k bits in a packet for encoding the
> (sub-)tree, right? So the key is how many bits in one packet you can use
> for encoding the (sub-)tree, and how many leaf/replication nodes you can
> fit in.
>
> [HC3]: This is for the case where the overhead in a BIER-TE packet is 10k
> bits
>
> when one BitString without any set is used. For the same overhead, how
> many links
>
> can be encoded.
>
>
>
> Zzh> Additionally, the text seems to assume/imply that there is a
> continuous block of uSIDs. But with 40-bit uSIDs, you can only put two
> uSIDs in an SRv6 SID? How large would the segment list be if you want to
> encode a reasonably sized sub-tree?
>
> [HC2]: "assume" is used to name a variable, such as "assume" that
>
> the size of compressed SID is u bits. Here u is a variable for the
>
> size of compressed SID on average. Some example values for the
>
> variables are given. Nothing further is implied.
>
> There are a few methods for compressing SIDs. The method using uSIDs
>
> is one of them. The size of the segment list for a sub-tree is u*N bits,
>
> where N is the number of links on the sub-tree.
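The question about 40-bit uSIDs and the u*N estimate can be related with a rough sketch. It assumes each 128-bit SID starts with a uSID block prefix and that one uSID is spent per link; the 32-bit block length is a hypothetical value picked so that two 40-bit uSIDs fit per SID, as the question suggests, and is not stated anywhere in the thread. The function names are illustrative.

```python
import math

SID_BITS = 128


def usids_per_sid(usid_bits: int, block_bits: int) -> int:
    """How many uSIDs fit in one 128-bit SID after the block prefix."""
    return (SID_BITS - block_bits) // usid_bits


def sids_for_subtree(n_links: int, usid_bits: int, block_bits: int) -> int:
    """128-bit SIDs needed to carry one uSID per link of the sub-tree."""
    per_sid = usids_per_sid(usid_bits, block_bits)
    if per_sid <= 0:
        raise ValueError("uSID does not fit next to the block prefix")
    return math.ceil(n_links / per_sid)


BLOCK_BITS = 32   # hypothetical uSID block length
USID_BITS = 40    # value from the question above

n_links = 48
sids = sids_for_subtree(n_links, USID_BITS, BLOCK_BITS)
print(f"{usids_per_sid(USID_BITS, BLOCK_BITS)} uSIDs per SID")
print(f"{n_links} links: ideal u*N = {n_links * USID_BITS} bits, "
      f"actual segment list = {sids * SID_BITS} bits")
```

Under these assumptions the segment list is larger than u*N because the block prefix is repeated and leftover bits in each SID go unused, which is why the per-SID packing matters for the scaling argument.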
>
>
>
> Zzh2> How about putting some uSID details in the draft as I mentioned in
> the other email?
>
> [HC3]: We will add some details in the draft.
>
>
> 3. If this did not involve new forwarding scheme (i.e. new hardware) one
> could argue that multiple solutions could be developed. But given that this
> does need new hardware and does not do better than alternatives (e.g.
> BIER-TE), why bother.
>
> [HC]: It seems that both SR-P2MP and BIER-TE are involved with new
> forwarding schemes.
>
> Zzh2> SR-P2MP with MPLS does NOT involve a new forwarding scheme at all.
> It’s the same as existing mLDP/RSVP-TE P2MP in the forwarding plane. For
> SRv6 replication segment, it’s also “like” existing MPLS P2MP in that part
> of the IPv6 address is used as lookup key just like a label (and that is
> not different from the SRv6 VPN).
>
> [HC3]: It should be a new forwarding scheme for SRv6 replication segment.
>
> Under the network programming, replication SID is a new type of SID.
>
> It will have a new program/procedure for the behavior of this new SID
>
> even though it is “like” existing MPLS P2MP.
>
> Zzh2> The scheme in draft-chen is so much different - it encodes a
> sub-tree and forwarding is certainly more complicated. If it does scale
> well, then  it is worth it even as a new forwarding scheme, but that has
> not been concluded.
>
> [HC3]: Under the network programming, the behavior of the forwarding
>
> is a new program/procedure, which uses existing encap with the SIDs of a
>
> sub-tree. It seems not that complicated.
>
> Zzh2> BIER-TE is a new forwarding scheme, but it is by far the best
> per-tunnel TE solution for multicast that does not have per-tunnel state
> inside the network, and it went through due scrutiny. Draft-chen needs to
> be scrutinized as well, and it needs to show that it does better to proceed
> as yet another new forwarding scheme.
>
> [HC3]: Whether it is the best depends on a number of factors, some of them
>
> are explained in previous responses.
>
> Zzh2> Jeffrey
>
> Zzh> Because of new forwarding schemes, BIER went through the process of
> forming a new WG, and BIER-TE was originally on the experimental track.
> More importantly, my earlier point is that the new scheme should scale
> better (at least as well) for it to be worthy.
>
> [HC2]: The new forwarding scheme (or say forwarding behavior) in our draft
>
> can be defined in a small program/procedure in the network programming.
> This
>
> is very different from that in BIER, where a new WG was formed.
>
> We will update the draft to address your concerns and make it clearer.
>
> Your earlier point is addressed in my previous response.
>
>
>
> Zzh> Jeffrey
>
>
>
> Juniper Business Use Only
>
>
> _______________________________________________
> pim mailing list
> pim@ietf.org
> https://www.ietf.org/mailman/listinfo/pim
>
> --
>
> *Gyan Mishra*
>
> *Network Solutions Architect *
>
>
>
> *M 301 502-1347 13101 Columbia Pike  *Silver Spring, MD
>