Re: [pim] pim wg adoption call for draft-chen-pim-srv6-p2mp-path-01

Gyan Mishra <hayabusagsm@gmail.com> Thu, 28 January 2021 23:27 UTC

MIME-Version: 1.0
References: <CAHANBtLs2F+x9ny8Jv=qe-28dFubcQL=k8bYXO4sybBr_Zpe5Q@mail.gmail.com> <DM6PR08MB397800060F4718FAAC2D947C91AA0@DM6PR08MB3978.namprd08.prod.outlook.com> <MN2PR05MB598196F87D3F4764EE13506AD4AA0@MN2PR05MB5981.namprd05.prod.outlook.com> <MN2PR13MB40879EEF5991503823F09D31F2A69@MN2PR13MB4087.namprd13.prod.outlook.com> <MN2PR05MB598150F8FA540A0426F6F8AED4A39@MN2PR05MB5981.namprd05.prod.outlook.com> <MN2PR13MB4087C8ED6C857764A1A13821F2A09@MN2PR13MB4087.namprd13.prod.outlook.com> <MN2PR05MB59814F815F09A61B8CDA32E7D4BD9@MN2PR05MB5981.namprd05.prod.outlook.com> <MN2PR13MB40870CD18814B56E04DD0520F2BD9@MN2PR13MB4087.namprd13.prod.outlook.com> <MN2PR05MB59815F695AD065F5C32AB1EDD4BC9@MN2PR05MB5981.namprd05.prod.outlook.com> <MN2PR13MB4087E8E538DDD5FC6F8C6443F2BB9@MN2PR13MB4087.namprd13.prod.outlook.com> <MN2PR05MB598123E73199AB595782A0D8D4BA9@MN2PR05MB5981.namprd05.prod.outlook.com> <CABNhwV2CQ=2GQgpeO5bVeUjhjO31rr9D7J=a=3QNbLfj90GGqw@mail.gmail.com> <C445E577-5B72-4EC9-A9F5-1E596558F1F1@cisco.com>
In-Reply-To: <C445E577-5B72-4EC9-A9F5-1E596558F1F1@cisco.com>
From: Gyan Mishra <hayabusagsm@gmail.com>
Date: Thu, 28 Jan 2021 18:27:32 -0500
Message-ID: <CABNhwV0d8w8822Ha8d0gQ0M_krHf6EQ17gdzYiFMuwBY4qijsw@mail.gmail.com>
To: "Acee Lindem (acee)" <acee@cisco.com>
Cc: "Jeffrey (Zhaohui) Zhang" <zzhang=40juniper.net@dmarc.ietf.org>, Toerless Eckert <tte@cs.fau.de>, "pim@ietf.org" <pim@ietf.org>
Content-Type: multipart/alternative; boundary="000000000000097b3005b9fe3cbd"
Archived-At: <https://mailarchive.ietf.org/arch/msg/pim/tjAE7Gz07aivvMBiFBX385fO6XM>
Subject: Re: [pim] pim wg adoption call for draft-chen-pim-srv6-p2mp-path-01

Understood.

What is nice about SRv6 is that you have the PHP function to support PEs
that are not yet capable of SRH processing, as well as SRv6-BE without an SRH,
so you can phase your brownfield deployments toward being fully SRv6 capable
over time. You don’t have to upgrade all nodes at once unless you are building
a brand new greenfield core.

Kind Regards

Gyan

On Thu, Jan 28, 2021 at 10:45 AM Acee Lindem (acee) <acee@cisco.com> wrote:

> Hi Gyan,
>
>
>
> *From: *pim <pim-bounces@ietf.org> on behalf of Gyan Mishra <
> hayabusagsm@gmail.com>
> *Date: *Thursday, January 28, 2021 at 8:48 AM
> *To: *"Jeffrey (Zhaohui) Zhang" <zzhang=40juniper.net@dmarc.ietf.org>
> *Cc: *Toerless Eckert <tte@cs.fau.de>, "pim@ietf.org" <pim@ietf.org>
> *Subject: *Re: [pim] pim wg adoption call for
> draft-chen-pim-srv6-p2mp-path-01
>
>
>
> All,
>
>
>
> Huaimo has provided some valuable responses to scaling issues.  One point
> to note as far as scaling goes is that almost all providers run jumbo frames
> with a 9216 MTU, and the IPv4 length field is 16 bits, which can support a
> packet of up to 65,535 octets.
>
>
>
> The IPv6 header natively has a 16-bit payload length field, the same width
> as the IPv4 length field.
>
>
>
> RFC 2675 specifies a hop-by-hop Jumbo Payload option that carries a 32-bit
> length field, allowing payloads between 65,536 and 4,294,967,295 octets in
> length. Packets with such long payloads are referred to as
>
>    "jumbograms".
>
>
>
> All router and switch vendors today support super jumbo, which is a payload
> length between 9000 and 65535, capping today at 9216.
>
>
>
> In the future, as memory and buffer sizes increase in hardware (fixed-function
> ASIC, FPGA or programmable NPU) for packet I/O processing, the sizes
> will eventually jump to 65535.  The next big step will be jumbograms.  😀
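As a rough check of the size limits discussed above, the following sketch computes the largest value each length field can carry and the headroom left above a 9216 MTU. The function and constant names are illustrative assumptions; only the field widths and the 9216 figure come from the message.

```python
# Sketch: largest length expressible by each field width discussed above.
# Names are illustrative; only the widths and the 9216 MTU come from the thread.

def max_field_value(bits: int) -> int:
    """Largest unsigned integer that fits in a field of the given width."""
    return (1 << bits) - 1

IPV4_TOTAL_LENGTH_BITS = 16     # IPv4 Total Length (header plus data)
IPV6_PAYLOAD_LENGTH_BITS = 16   # IPv6 Payload Length (excludes the fixed header)
JUMBO_PAYLOAD_LENGTH_BITS = 32  # Jumbo Payload hop-by-hop option length
COMMON_JUMBO_MTU = 9216         # MTU figure quoted in the message

def headroom(bits: int, mtu: int = COMMON_JUMBO_MTU) -> int:
    """Octets between the quoted MTU and the field's maximum value."""
    return max_field_value(bits) - mtu

if __name__ == "__main__":
    for name, bits in [
        ("IPv4 Total Length", IPV4_TOTAL_LENGTH_BITS),
        ("IPv6 Payload Length", IPV6_PAYLOAD_LENGTH_BITS),
        ("Jumbo Payload option", JUMBO_PAYLOAD_LENGTH_BITS),
    ]:
        print(f"{name}: max {max_field_value(bits)} octets, "
              f"{headroom(bits)} above a {COMMON_JUMBO_MTU} MTU")
```

The 16-bit fields leave a large gap above today's 9216 cap, which is the growth room the message refers to before jumbograms are needed.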
>
>
>
> With SRv6, unlike SR-MPLS, traditional MPLS-based PTAs such as RSVP-TE or
> mLDP cannot be leveraged, so the options for a P2MP MDT are very limited. If
> you don’t want the source node overhead of IR or the SR replication SID, you
> have BIER as your only option.
>
>
>
> Operators love having options, and this draft provides a valuable option
> that does not require expensive upgrades to support BFR BIER.
>
>
>
> But you will require an even more expensive upgrade to support the SRv6
> network function to process the MDT subtree in the segment list.
>
>
>
> Acee
>
>
>
> I support WG adoption as stated, as this draft is very promising for
> operators as an SRv6 “BIER alternative”.
>
>
>
> Thanks
>
>
>
> Gyan
>
>
>
> On Wed, Jan 27, 2021 at 9:13 PM Jeffrey (Zhaohui) Zhang <zzhang=
> 40juniper.net@dmarc.ietf.org> wrote:
>
> Hi Huaimo,
>
>
>
> “For a network with 4k nodes”, how large could a multicast tree be, and
> with draft-chen, how do you encode that tree in packets? How many copies
> would you send if each copy is only for a sub-tree?
>
>
>
> A large network can be divided into BIER sub-domains, each with a smaller
> bitstring. If you use tunnel segmentation then you can have efficient
> replication all the way through (though the segmentation points will have
> per-tree state). If you don’t want to use segmentation, the ingress can
> tunnel one copy to each sub-domain and inside each sub-domain you can have
> efficient replication.
>
>
>
> Jeffrey
>
>
>
> *From:* Huaimo Chen <huaimo.chen@futurewei.com>
> *Sent:* Tuesday, January 26, 2021 10:39 PM
> *To:* Jeffrey (Zhaohui) Zhang <zzhang@juniper.net>; Jeffrey (Zhaohui)
> Zhang <zzhang=40juniper.net@dmarc.ietf.org>; Bidgoli, Hooman (Nokia -
> CA/Ottawa) <hooman.bidgoli@nokia.com>; Stig Venaas <stig@venaas.com>;
> pim@ietf.org; 'Toerless Eckert' <tte@cs.fau.de>
> *Subject:* Re: [pim] pim wg adoption call for
> draft-chen-pim-srv6-p2mp-path-01
>
>
>
> *[External Email. Be cautious of content]*
>
>
>
> Hi Jeffrey,
>
>
>
>     Thanks for your further comments.
>
>     My responses are inline below with prefix [HC4].
>
>
>
> Best Regards,
>
> Huaimo
> ------------------------------
>
> *From:* Jeffrey (Zhaohui) Zhang <zzhang@juniper.net>
> *Sent:* Monday, January 25, 2021 10:08 PM
> *To:* Huaimo Chen <huaimo.chen@futurewei.com>; Jeffrey (Zhaohui) Zhang
> <zzhang=40juniper.net@dmarc.ietf.org>; Bidgoli, Hooman (Nokia - CA/Ottawa)
> <hooman.bidgoli@nokia.com>; Stig Venaas <stig@venaas.com>; pim@ietf.org
> <pim@ietf.org>; 'Toerless Eckert' <tte@cs.fau.de>
> *Subject:* RE: [pim] pim wg adoption call for
> draft-chen-pim-srv6-p2mp-path-01
>
>
>
> Hi Huaimo,
>
>
>
> Let me pull two points to the top of this email.
>
>
>
>
>
> [HC3]: This is for the case where the overhead in a BIER-TE packet is 10k
> bits when one BitString without any set is used. For the same overhead, how
> many links can be encoded.
>
> Zzh3> BIER-TE won’t use 10k bits in the packet. It will use a finite
> number of bits (e.g. 512 bits or  1024 bits) with multiple sets, just like
> you break the entire tree into different sub-trees. The comparison is, say
> with 1024 bits as encoding space, how many nodes/links can each solution
> encode. I don’t see how draft-chen can do better.
>
> [HC4]: For a network with 4k nodes, there can be about 32k links and 64k
> BitPositions.
>
> When we use 1024 bits with multiple sets in the packet header, the set
> identifier (SI) and the per-set BitString together need to address 64k
> BitPositions, but the 1024 bits can hold only the part of a sub-tree that
> fits in 1024 bits.
>
> If the size of a set is 64 bits, there are 64k/64 = 1024 sets, so the SI
> needs 10 bits.
>
> Each set is encoded using 74 bits (64 bits for the BitString + 10 bits for
> the SI).
>
> 1024 bits can contain 1024/74 = 13 sets (rounded down).
>
> For a sub-tree traversing 13 links with BitPositions in 13 sets,
> 1024 bits encodes 13 links of this sub-tree in BIER-TE.
>
> With good SID compression, the size of a compressed SID can be 21 bits
> (12 bits for node IDs, 3 bits for branches, 6 bits for the number of SIDs), so
> 1024 bits can encode 1024/21 = 48 links/nodes of a sub-tree in draft-chen.
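The arithmetic in the [HC4] response can be reproduced with a short sketch. The function names are hypothetical, and the BIER-TE figure uses the same assumption as the response: each link of the sub-tree has its BitPosition in a different set, so every link costs one full (SI + BitString) entry.

```python
# Sketch of the [HC4] arithmetic. Function names are hypothetical; the inputs
# (1024-bit budget, 64-bit sets, 64k BitPositions, 12+3+6-bit SIDs) are the
# example values given in the thread.

def bier_te_links(budget_bits: int, bitstring_bits: int, bitpositions: int):
    """Return (SI bits, bits per set, links encodable) for BIER-TE with sets.

    Assumes one link per set, as in the response being illustrated.
    """
    num_sets = -(-bitpositions // bitstring_bits)  # ceiling division
    si_bits = (num_sets - 1).bit_length()          # bits needed to index a set
    per_set_bits = bitstring_bits + si_bits
    return si_bits, per_set_bits, budget_bits // per_set_bits

def compressed_sid_links(budget_bits: int, node_bits: int,
                         branch_bits: int, count_bits: int):
    """Return (bits per compressed SID, links/nodes encodable)."""
    sid_bits = node_bits + branch_bits + count_bits
    return sid_bits, budget_bits // sid_bits

if __name__ == "__main__":
    print(bier_te_links(1024, 64, 64 * 1024))    # (10, 74, 13)
    print(compressed_sid_links(1024, 12, 3, 6))  # (21, 48)
```

If several links of a sub-tree share a set, BIER-TE encodes more than one link per 74-bit entry, so the 13-link figure is specific to the one-link-per-set assumption.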
>
>
>
> [HC3]: It should be a new forwarding scheme for the SRv6 replication segment.
> Under network programming, the replication SID is a new type of SID.
> It will have a new program/procedure for the behavior of this new SID,
> even though it is “like” existing MPLS P2MP.
>
> Zzh3> The general SRv6 network programming idea is that an IPv6 address is
> broken into the locator part and the func/arg part. The locator part gets
> traffic to a node and the func/arg part is looked up to decide what to do
> next. SRv6-P2MP follows that perfectly, so **nothing is new**. When I
> said it is “like” existing MPLS P2MP, I should also have said that it uses
> **existing** SRv6 mechanisms.
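The locator and func/arg split described above can be illustrated with a minimal sketch. The 64-bit locator and 16-bit function lengths are assumptions chosen for the example (these lengths are deployment choices), and `split_sid` is a hypothetical helper, not an API from any SRv6 implementation.

```python
# Sketch: split a 128-bit SRv6 SID into locator, function and argument parts.
# The field lengths passed in are assumptions made for illustration only.
import ipaddress

def split_sid(sid: str, locator_bits: int, function_bits: int):
    """Return (locator, function, argument) as integers."""
    arg_bits = 128 - locator_bits - function_bits
    if locator_bits < 0 or function_bits < 0 or arg_bits < 0:
        raise ValueError("field lengths must be non-negative and sum to <= 128")
    value = int(ipaddress.IPv6Address(sid))
    locator = value >> (128 - locator_bits)
    function = (value >> arg_bits) & ((1 << function_bits) - 1)
    argument = value & ((1 << arg_bits) - 1)
    return locator, function, argument

if __name__ == "__main__":
    # Documentation-prefix address used purely as an example SID.
    loc, func, arg = split_sid("2001:db8:0:1:42::7", 64, 16)
    print(hex(loc), hex(func), hex(arg))
```

In this picture the locator is what routing uses to reach the node, and the function value is the key the node looks up locally, which is the label-like lookup the message compares to MPLS P2MP.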
>
> Zzh3> Jeffrey
>
>
>
> *From:* Huaimo Chen <huaimo.chen@futurewei.com>
> *Sent:* Monday, January 25, 2021 11:45 AM
> *To:* Jeffrey (Zhaohui) Zhang <zzhang@juniper.net>; Jeffrey (Zhaohui)
> Zhang <zzhang=40juniper.net@dmarc.ietf.org>; Bidgoli, Hooman (Nokia -
> CA/Ottawa) <hooman.bidgoli@nokia.com>; Stig Venaas <stig@venaas.com>;
> pim@ietf.org; 'Toerless Eckert' <tte@cs.fau.de>
> *Subject:* Re: [pim] pim wg adoption call for
> draft-chen-pim-srv6-p2mp-path-01
>
>
>
> *[External Email. Be cautious of content]*
>
>
>
> Hi Jeffrey,
>
>
>
>     Thanks for your further comments.
>
>     My responses are inline below with prefix [HC3].
>
>
>
> Best Regards,
>
> Huaimo
>
>
> ------------------------------
>
> *From:* Jeffrey (Zhaohui) Zhang <zzhang@juniper.net>
> *Sent:* Sunday, January 24, 2021 11:26 PM
> *To:* Huaimo Chen <huaimo.chen@futurewei.com>; Jeffrey (Zhaohui) Zhang
> <zzhang=40juniper.net@dmarc.ietf.org>; Bidgoli, Hooman (Nokia - CA/Ottawa)
> <hooman.bidgoli@nokia.com>; Stig Venaas <stig@venaas.com>; pim@ietf.org
> <pim@ietf.org>; 'Toerless Eckert' <tte@cs.fau.de>
> *Subject:* RE: [pim] pim wg adoption call for
> draft-chen-pim-srv6-p2mp-path-01
>
>
>
> Hi Huaimo,
>
>
>
> Please see zzh2> below. I trimmed some text.
>
>
>
>
>
> Zzh> Plus the uSID-Block-ID, for each SRv6 SID that is needed? More below.
>
> [HC2]: The size of a compressed SID (u bits) is the average size of
> a compressed SID. If the compression method uses a uSID-Block-ID, it is
> counted in the average size. There are a few methods for compressing SIDs.
> They may have different compression rates.
>
>
>
> Zzh2> Before WG adoption, the draft needs to have details spelled out with
> one of the compression methods. I don’t think it scales even w/
> compression, but it’s a non-starter at all w/o compression.
>
> [HC3]: The details of one compression method are posted in one of my
> previous responses.
>
> I believe that it is scalable with compression. The example of one
> compression method illustrated that, on average, one link on a multicast
> tree uses 32 bits. This can be improved to 20 bits per link.
>
> It can be a starter without compression in the following way.
> For a P2MP/multicast path/tree, the tree can be "split" into multiple
> sub-trees such that each of the sub-trees can be encoded in a segment list
> of finite size.
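To make the sub-tree splitting concrete, here is a rough sketch of how many packet copies the ingress would need at a given per-link encoding cost. The 1024-bit budget and the 200-link tree are hypothetical inputs; the 32 and 20 bits per link are the figures quoted above. The result is only a lower bound, because it ignores any segments needed to carry each copy from the ingress to the root of its sub-tree.

```python
# Sketch: lower bound on the number of copies when a tree is split into
# sub-trees that each fit a fixed encoding budget. Function names, the
# 1024-bit budget and the 200-link tree are assumptions for illustration.

def links_per_packet(budget_bits: int, bits_per_link: int) -> int:
    """Links of a sub-tree that fit in one packet's encoding budget."""
    if bits_per_link <= 0:
        raise ValueError("bits_per_link must be positive")
    return budget_bits // bits_per_link

def min_copies(tree_links: int, budget_bits: int, bits_per_link: int) -> int:
    """Lower bound on copies; ignores the path from ingress to each sub-tree."""
    per_packet = links_per_packet(budget_bits, bits_per_link)
    if per_packet == 0:
        raise ValueError("budget too small to encode even one link")
    return -(-tree_links // per_packet)  # ceiling division

if __name__ == "__main__":
    for bits in (32, 20):
        print(bits, links_per_packet(1024, bits), min_copies(200, 1024, bits))
```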
>
>
>
> Suppose that the sizes of SI and BitString are s and b bits respectively,
> and the maximum number of possible BitPositions is M. The overhead of
> BIER-TE is M bits if a maximum BitString is used
> (in this case, M may be greater than N*u; for example, when M = 10k and
> u = 40, M > N*u for N < 10k/40),
> and up to N*(s+b) bits if bit sets are used
> (in this case, N*(s+b) may be greater than N*u; for example, when b = 64,
> s = 10 and u = 40, N*(s+b) > N*u).
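The three overhead expressions compared above (M, N*(s+b) and N*u) can be evaluated side by side. This is a sketch using the example values from the text, with N taken as the number of links on the sub-tree and u as the average compressed SID size; the function names and the choice N = 100 are assumptions.

```python
# Sketch: compare the per-packet overhead expressions from the text.
# M, s, b, u use the example values given; N = 100 is a hypothetical input.

def overhead_single_bitstring(m_bits: int) -> int:
    """BIER-TE with one maximum-size BitString: M bits."""
    return m_bits

def overhead_bit_sets(n_links: int, s_bits: int, b_bits: int) -> int:
    """BIER-TE with sets, worst case of one set per link: N*(s+b) bits."""
    return n_links * (s_bits + b_bits)

def overhead_compressed_sids(n_links: int, u_bits: int) -> int:
    """Segment list of compressed SIDs: N*u bits."""
    return n_links * u_bits

def breakeven_links(m_bits: int, u_bits: int) -> int:
    """N*u stays below M while N is less than this value (M/u)."""
    return m_bits // u_bits

if __name__ == "__main__":
    M, s, b, u, N = 10_000, 10, 64, 40, 100
    print(overhead_single_bitstring(M))     # 10000
    print(overhead_bit_sets(N, s, b))       # 7400
    print(overhead_compressed_sids(N, u))   # 4000
    print(breakeven_links(M, u))            # 250
```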
>
>
>
> zzh> Do you mean “M = 10, 000”?
>
> [HC2]: This is an example value for M. For a network with 10k links, there
> are 20k bitpositions.
>
> On average, the maximum bitposition is around 10k.
>
>
>
> zzh> The overhead consideration should be on a per-packet basis. How much
> of a sub-tree can you encode in **one** packet with a header of a certain
> size? Your calculation does not seem to be for that (and I have trouble
> following it).
>
> [HC2]: With overhead M bits in a packet, M/u links can be encoded.
>
>
>
> Zzh2> I assume you won’t have 10k bits in a packet for encoding the
> (sub-)tree, right? So the key is how many bits in one packet you can use
> for encoding the (sub-)tree, and how many leaf/replication nodes you can
> fit in.
>
> [HC3]: This is for the case where the overhead in a BIER-TE packet is 10k
> bits when one BitString without any set is used. For the same overhead, how
> many links can be encoded.
>
>
>
> Zzh> Additionally, the text seems to assume/imply that there is a
> continuous block of uSIDs. But with 40-bit uSIDs, you can only put two
> uSIDs in an SRv6 SID? How large would the segment list be if you want to
> encode a reasonably sized sub-tree?
>
> [HC2]: "Assume" is used to name a variable, as in "assume that the size of
> a compressed SID is u bits". Here u is a variable for the average size of
> a compressed SID. Some example values for the variables are given. Nothing
> more is implied.
>
> There are a few methods for compressing SIDs. The method using uSIDs
> is one of them. The size of the segment list for a sub-tree is u*N bits,
> where N is the number of links on the sub-tree.
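The two views in this exchange can be put next to each other: the u*N estimate treats compressed SIDs as tightly packed, while the question about 40-bit uSIDs concerns how many fit in one 128-bit SRv6 SID once a block prefix is present. The sketch below assumes a 32-bit uSID block, which is a hypothetical value chosen because it reproduces the "only two uSIDs" observation; the function names are also illustrative.

```python
# Sketch: tightly packed size (u*N bits) versus the number of 128-bit SRv6
# SIDs needed when each SID carries a block prefix followed by whole uSIDs.
# The 32-bit block length and N = 48 are assumptions for illustration.

SID_BITS = 128

def packed_segment_list_bits(n_links: int, u_bits: int) -> int:
    """Size of the segment list under the u*N estimate."""
    return n_links * u_bits

def usids_per_container(block_bits: int, usid_bits: int) -> int:
    """Whole uSIDs that fit in one 128-bit SID after the block prefix."""
    if not 0 <= block_bits < SID_BITS or usid_bits <= 0:
        raise ValueError("invalid block or uSID length")
    return (SID_BITS - block_bits) // usid_bits

def containers_needed(n_links: int, block_bits: int, usid_bits: int) -> int:
    """128-bit SIDs needed to carry one uSID per link."""
    per_container = usids_per_container(block_bits, usid_bits)
    if per_container == 0:
        raise ValueError("uSID does not fit in a SID with this block length")
    return -(-n_links // per_container)  # ceiling division

if __name__ == "__main__":
    n, u, block = 48, 40, 32
    print(packed_segment_list_bits(n, u))            # 1920
    print(usids_per_container(block, u))             # 2
    print(containers_needed(n, block, u) * SID_BITS) # 3072
```

Under these assumptions the on-wire segment list is larger than u*N unless u is defined to include the block overhead, which is how the [HC2] response treats the average size.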
>
>
>
> Zzh2> How about putting some uSID details in the draft as I mentioned in
> the other email?
>
> [HC3]: We will add some details in the draft.
>
>
> 3. If this did not involve a new forwarding scheme (i.e. new hardware) one
> could argue that multiple solutions could be developed. But given that this
> does need new hardware and does not do better than alternatives (e.g.
> BIER-TE), why bother?
>
> [HC]: It seems that both SR-P2MP and BIER-TE involve new
> forwarding schemes.
>
> Zzh2> SR-P2MP with MPLS does NOT involve a new forwarding scheme at all.
> It’s the same as existing mLDP/RSVP-TE P2MP in the forwarding plane. For
> SRv6 replication segment, it’s also “like” existing MPLS P2MP in that part
> of the IPv6 address is used as lookup key just like a label (and that is
> not different from the SRv6 VPN).
>
> [HC3]: It should be a new forwarding scheme for the SRv6 replication segment.
> Under network programming, the replication SID is a new type of SID.
> It will have a new program/procedure for the behavior of this new SID,
> even though it is “like” existing MPLS P2MP.
>
> Zzh2> The scheme in draft-chen is so much different - it encodes a
> sub-tree and forwarding is certainly more complicated. If it does scale
> well, then  it is worth it even as a new forwarding scheme, but that has
> not been concluded.
>
> [HC3]: Under network programming, the forwarding behavior is a new
> program/procedure, which uses the existing encapsulation with the SIDs of a
> sub-tree. It does not seem that complicated.
>
> Zzh2> BIER-TE is a new forwarding scheme, but it is by far the best
> per-tunnel TE solution for multicast that does not have per-tunnel state
> inside the network, and it went through due scrutiny. Draft-chen needs to
> be scrutinized as well, and it needs to show that it does better to proceed
> as yet another new forwarding scheme.
>
> [HC3]: Whether it is the best depends on a number of factors, some of which
> are explained in previous responses.
>
> Zzh2> Jeffrey
>
> Zzh> Because of new forwarding schemes, BIER went through the process of
> forming a new WG, and BIER-TE was originally on the experimental track.
> More importantly, my earlier point is that the new scheme should scale
> better (at least as well) for it to be worthy.
>
> [HC2]: The new forwarding scheme (or, say, forwarding behavior) in our draft
> can be defined as a small program/procedure in network programming. This
> is very different from BIER, for which a new WG was formed.
>
> We will update the draft to address your concerns and make it clearer.
>
> Your earlier point is addressed in my previous response.
>
>
>
> Zzh> Jeffrey
>
>
>
> Juniper Business Use Only
>
> _______________________________________________
> pim mailing list
> pim@ietf.org
> https://www.ietf.org/mailman/listinfo/pim
>
> --
>
> *Gyan Mishra*
>
> *Network Solutions Architect *
>
>
>
> *M 301 502-1347*
> 13101 Columbia Pike, Silver Spring, MD
>
>
>
-- 

<http://www.verizon.com/>

*Gyan Mishra*

*Network Solutions Architect*



*M 301 502-1347*
13101 Columbia Pike, Silver Spring, MD