Re: [pim] pim wg adoption call for draft-chen-pim-srv6-p2mp-path-01

Gyan Mishra <hayabusagsm@gmail.com> Thu, 28 January 2021 13:51 UTC

From: Gyan Mishra <hayabusagsm@gmail.com>
Date: Thu, 28 Jan 2021 08:50:54 -0500
Message-ID: <CABNhwV3ERswvzpQ5GokwnTOYxVTSOe5QqinZZROLf1t1rn2mUQ@mail.gmail.com>
To: "Jeffrey (Zhaohui) Zhang" <zzhang=40juniper.net@dmarc.ietf.org>
Cc: "Bidgoli, Hooman (Nokia - CA/Ottawa)" <hooman.bidgoli@nokia.com>, Huaimo Chen <huaimo.chen@futurewei.com>, Stig Venaas <stig@venaas.com>, Toerless Eckert <tte@cs.fau.de>, "pim@ietf.org" <pim@ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/pim/FzFir2KxewGjrlr2eB1QXn2GPTU>
Subject: Re: [pim] pim wg adoption call for draft-chen-pim-srv6-p2mp-path-01

See the IPv6 Jumbo Payload (jumbogram) option codepoint in the IANA IPv6 parameters registry.

https://www.iana.org/assignments/ipv6-parameters/ipv6-parameters.xhtml

Hex value 0xC2 (act = 11, chg = 0, rest = 00010): Jumbo Payload [RFC2675 <https://www.iana.org/go/rfc2675>]
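
The registry entry above can be checked by splitting the option type octet into its act, chg and rest fields. The following is a minimal sketch, not a reference implementation: the function names are made up for illustration, and the option layout (type 0xC2, Opt Data Len 4, 32-bit length greater than 65,535) is as I read RFC 2675.

```python
import struct

JUMBO_PAYLOAD_OPTION_TYPE = 0xC2  # codepoint from the IANA registry entry


def split_option_type(option_type):
    """Split an IPv6 option type octet into (act, chg, rest)."""
    act = (option_type >> 6) & 0b11   # top two bits
    chg = (option_type >> 5) & 0b1    # third bit
    rest = option_type & 0b11111      # low five bits
    return act, chg, rest


def build_jumbo_payload_option(length):
    """Encode the option as type, Opt Data Len (4), 32-bit length.

    Hypothetical helper; the bounds follow my reading of RFC 2675.
    """
    if not 65536 <= length <= 0xFFFFFFFF:
        raise ValueError("jumbo payload length must fit 65,536..4,294,967,295")
    return struct.pack("!BBI", JUMBO_PAYLOAD_OPTION_TYPE, 4, length)


print(split_option_type(JUMBO_PAYLOAD_OPTION_TYPE))
print(build_jumbo_payload_option(65536).hex())
```

The three fields come back as 0b11, 0, and 0b00010, which matches the "11 0 00010" columns in the registry row.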

On Thu, Jan 28, 2021 at 8:48 AM Gyan Mishra <hayabusagsm@gmail.com> wrote:

> All,
>
> Huaimo has provided some valuable responses to the scaling issues.  One
> point to note as far as scaling is that almost all providers run jumbo
> frames with a 9216-byte MTU, and the IPv4 Total Length field is 16 bits,
> which can support packets of up to 65,535 octets.
>
> The IPv6 header natively has a 16-bit Payload Length field, which is the
> same size as the IPv4 length field.
>
> RFC 2675 specifies a hop-by-hop Jumbo Payload option that carries a 32-bit
> length field, allowing payloads between 65,536 and 4,294,967,295 octets in
> length. Packets with such long payloads are referred to as "jumbograms".
>
>
> All router and switch vendors today support super jumbo frames, that is, a
> payload length between 9000 and 65535, capped today at 9216.
>
> In the future, as memory and buffer sizes increase in hardware (fixed
> function ASIC, FPGA or programmable NPU) for packet IO processing, the
> sizes will eventually jump to 65535.  The next big step will be
> jumbograms.  😀
>
> With SRv6, unlike SR-MPLS, traditional MPLS-based PTAs such as RSVP-TE or
> mLDP cannot be leveraged, so the options for a P2MP MDT are very limited.
> If you don’t want the source node overhead of IR or an SR replication SID,
> you have BIER as your only option.
>
> Operators love having options, and this draft provides a valuable option
> that does not require expensive upgrades to support BIER BFRs.
>
> I support WG adoption as stated, as this draft is very promising for
> operators as an SRv6 “BIER alternative”.
>
> Thanks
>
> Gyan
>
> On Wed, Jan 27, 2021 at 9:13 PM Jeffrey (Zhaohui) Zhang <zzhang=
> 40juniper.net@dmarc.ietf.org> wrote:
>
>> Hi Huaimo,
>>
>>
>>
>> “For a network with 4k nodes”, how large could a multicast tree be, and
>> with draft-chen, how do you encode that tree in packets? How many copies
>> would you send if each copy is only for a sub-tree?
>>
>>
>>
>> A large network can be divided into BIER sub-domains, each with a smaller
>> bitstring. If you use tunnel segmentation then you can have efficient
>> replication all the way through (though the segmentation points will have
>> per-tree state). If you don’t want to use segmentation, the ingress can
>> tunnel one copy to each sub-domain and inside each sub-domain you can have
>> efficient replication.
>>
>>
>>
>> Jeffrey
>>
>>
>>
>> *From:* Huaimo Chen <huaimo.chen@futurewei.com>
>> *Sent:* Tuesday, January 26, 2021 10:39 PM
>> *To:* Jeffrey (Zhaohui) Zhang <zzhang@juniper.net>; Jeffrey (Zhaohui)
>> Zhang <zzhang=40juniper.net@dmarc.ietf.org>; Bidgoli, Hooman (Nokia -
>> CA/Ottawa) <hooman.bidgoli@nokia.com>; Stig Venaas <stig@venaas.com>;
>> pim@ietf.org; 'Toerless Eckert' <tte@cs.fau.de>
>> *Subject:* Re: [pim] pim wg adoption call for
>> draft-chen-pim-srv6-p2mp-path-01
>>
>>
>>
>> *[External Email. Be cautious of content]*
>>
>>
>>
>> Hi Jeffrey,
>>
>>
>>
>>     Thanks for your further comments.
>>
>>     My responses are inline below with prefix [HC4].
>>
>>
>>
>> Best Regards,
>>
>> Huaimo
>> ------------------------------
>>
>> *From:* Jeffrey (Zhaohui) Zhang <zzhang@juniper.net>
>> *Sent:* Monday, January 25, 2021 10:08 PM
>> *To:* Huaimo Chen <huaimo.chen@futurewei.com>; Jeffrey (Zhaohui) Zhang
>> <zzhang=40juniper.net@dmarc.ietf.org>; Bidgoli, Hooman (Nokia -
>> CA/Ottawa) <hooman.bidgoli@nokia.com>; Stig Venaas <stig@venaas.com>;
>> pim@ietf.org <pim@ietf.org>; 'Toerless Eckert' <tte@cs.fau.de>
>> *Subject:* RE: [pim] pim wg adoption call for
>> draft-chen-pim-srv6-p2mp-path-01
>>
>>
>>
>> Hi Huaimo,
>>
>>
>>
>> Let me pull two points to the top of this email.
>>
>>
>>
>>
>>
>> [HC3]: This is for the case where the overhead in a BIER-TE packet is 10k
>> bits when one BitString without any set is used. For the same overhead,
>> how many links can be encoded.
>>
>> Zzh3> BIER-TE won’t use 10k bits in the packet. It will use a finite
>> number of bits (e.g. 512 bits or  1024 bits) with multiple sets, just like
>> you break the entire tree into different sub-trees. The comparison is, say
>> with 1024 bits as encoding space, how many nodes/links can each solution
>> encode. I don’t see how draft-chen can do better.
>>
>> [HC4]: For a network with 4k nodes, there can be about 32k links and 64k
>> BitPositions.
>>
>> When we use 1024 bits with multiple sets in the packet header, the number
>> of sets and the size of a set need to represent 64k BitPositions, but the
>> 1024 bits carry only a sub-tree that can be encoded in 1024 bits.
>>
>> If the size of a set is 64 bits, there are 64k/64 = 1024 sets, so the set
>> identifier (SI) needs 10 bits. Each set is encoded using 74 bits (64 bits
>> for the BitString + 10 bits for the SI), so 1024 bits can contain
>> 1024/74 = 13 sets (rounded down).
>>
>> For a sub-tree traversing 13 links with BitPositions in 13 sets, 1024 bits
>> encode 13 links of this sub-tree in BIER-TE.
>>
>> With good SID compression, the size of a compressed SID can be 21 bits
>> (12 bits for node IDs, 3 bits for branches, 6 bits for the number of
>> SIDs), so 1024 bits can encode 1024/21 = 48 links/nodes (rounded down) of
>> a sub-tree in draft-chen.
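
The arithmetic in [HC4] can be reproduced with a few lines. This is only a sketch of the numbers quoted in the thread (64k BitPositions, 64-bit sets, a 1024-bit budget, a 21-bit compressed SID); the variable names are mine, and the sketch says nothing about whether those inputs are realistic.

```python
TOTAL_BIT_POSITIONS = 64 * 1024   # "64k BitPositions" from [HC4]
HEADER_BUDGET_BITS = 1024         # encoding space per packet
BITSTRING_BITS = 64               # size of one set

# BIER-TE with sets: each set needs a set identifier (SI) plus a BitString.
num_sets = TOTAL_BIT_POSITIONS // BITSTRING_BITS
si_bits = (num_sets - 1).bit_length()          # bits needed to number the sets
bits_per_set = BITSTRING_BITS + si_bits
sets_per_header = HEADER_BUDGET_BITS // bits_per_set

# draft-chen as described in [HC4]: one compressed SID per link/node.
compressed_sid_bits = 12 + 3 + 6   # node ID + branches + number of SIDs
sids_per_header = HEADER_BUDGET_BITS // compressed_sid_bits

print(num_sets, si_bits, bits_per_set, sets_per_header)
print(compressed_sid_bits, sids_per_header)
```

Note that the BIER-TE figure of 13 links is the worst case assumed in [HC4], where every link of the sub-tree falls in a different set; links whose BitPositions share a set would share one 74-bit entry.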
>>
>>
>>
>> [HC3]: It should be a new forwarding scheme for SRv6 replication segment.
>> Under the network programming, replication SID is a new type of SID. It
>> will have a new program/procedure for the behavior of this new SID even
>> though it is “like” existing MPLS P2MP.
>>
>> Zzh3> The general SRv6 network programming idea is that an IPv6 address
>> is broken into the locator part and the func/arg part. The locator part
>> gets traffic to a node and the func/arg part is looked up to decide what
>> to do next. SRv6-P2MP follows that perfectly, so **nothing is new**. When
>> I said it is “like” existing MPLS P2MP, I should also have said that it
>> uses **existing** SRv6 mechanisms.
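
The locator / func / arg split described in Zzh3 can be sketched as below. The bit lengths (64-bit locator, 16-bit function, 48-bit argument) and the example address are assumptions chosen for illustration only; the thread does not specify them, and real deployments choose their own SID structure.

```python
import ipaddress


def split_sid(sid, locator_bits=64, function_bits=16):
    """Split an SRv6 SID into (locator, function, argument) integers.

    The default lengths are hypothetical; the argument takes whatever
    bits remain after the locator and function parts.
    """
    value = int(ipaddress.IPv6Address(sid))
    arg_bits = 128 - locator_bits - function_bits
    locator = value >> (128 - locator_bits)
    function = (value >> arg_bits) & ((1 << function_bits) - 1)
    argument = value & ((1 << arg_bits) - 1)
    return locator, function, argument


# Documentation-prefix address used purely as an example SID.
locator, function, argument = split_sid("2001:db8:0:1:42::5")
print(hex(locator), hex(function), hex(argument))
```

In this picture the locator is what ordinary IPv6 routing matches on to reach the node, and the function value is the lookup key at that node, which is the sense in which it resembles a label lookup.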
>>
>> Zzh3> Jeffrey
>>
>>
>>
>> *From:* Huaimo Chen <huaimo.chen@futurewei.com>
>> *Sent:* Monday, January 25, 2021 11:45 AM
>> *To:* Jeffrey (Zhaohui) Zhang <zzhang@juniper.net>; Jeffrey (Zhaohui)
>> Zhang <zzhang=40juniper.net@dmarc.ietf.org>; Bidgoli, Hooman (Nokia -
>> CA/Ottawa) <hooman.bidgoli@nokia.com>; Stig Venaas <stig@venaas.com>;
>> pim@ietf.org; 'Toerless Eckert' <tte@cs.fau.de>
>> *Subject:* Re: [pim] pim wg adoption call for
>> draft-chen-pim-srv6-p2mp-path-01
>>
>>
>>
>> *[External Email. Be cautious of content]*
>>
>>
>>
>> Hi Jeffrey,
>>
>>
>>
>>     Thanks for your further comments.
>>
>>     My responses are inline below with prefix [HC3].
>>
>>
>>
>> Best Regards,
>>
>> Huaimo
>>
>>
>> ------------------------------
>>
>> *From:* Jeffrey (Zhaohui) Zhang <zzhang@juniper.net>
>> *Sent:* Sunday, January 24, 2021 11:26 PM
>> *To:* Huaimo Chen <huaimo.chen@futurewei.com>; Jeffrey (Zhaohui) Zhang
>> <zzhang=40juniper.net@dmarc.ietf.org>; Bidgoli, Hooman (Nokia -
>> CA/Ottawa) <hooman.bidgoli@nokia.com>; Stig Venaas <stig@venaas.com>;
>> pim@ietf.org <pim@ietf.org>; 'Toerless Eckert' <tte@cs.fau.de>
>> *Subject:* RE: [pim] pim wg adoption call for
>> draft-chen-pim-srv6-p2mp-path-01
>>
>>
>>
>> Hi Huaimo,
>>
>>
>>
>> Please see zzh2> below. I trimmed some text.
>>
>>
>>
>>
>>
>> Zzh> Plus the uSID-Block-ID, for each SRv6 SID that is needed? More below.
>>
>> [HC2]: The size of a compressed SID (u bits) is the average size of a
>> compressed SID. If a compression method uses a uSID-Block-ID, it is
>> included in the average size. There are a few methods for compressing
>> SIDs. They may have different compression rates.
>>
>>
>>
>> Zzh2> Before WG adoption, the draft needs to have details spelled out
>> with one of the compression methods. I don’t think it scales even w/
>> compression, but it’s a non-starter at all w/o compression.
>>
>> [HC3]: The details about one compression method are posted in one of my
>> previous responses.
>>
>> I believe that it is scalable with compression. The example of one
>> compression method illustrated that, on average, one link on a multicast
>> tree uses 32 bits. This can be improved to 20 bits per link.
>>
>> It can be a starter without compression in the following way. For a
>> P2MP/multicast path/tree, the tree can be "split" into multiple sub-trees
>> such that each of the sub-trees can be encoded in a segment list of
>> finite size.
>>
>>
>>
>> Suppose that the sizes of the SI and the BitString are s and b bits
>> respectively, and the maximum number of possible BitPositions is M. The
>> overhead of BIER-TE is:
>>
>> M bits if a maximum BitString is used (in this case, M may be greater
>> than N*u; for example, when M = 10k and u = 40, M > N*u for N < 10k/40);
>>
>> up to N*(s+b) bits if bit sets are used (in this case, N*(s+b) may be
>> greater than N*u; for example, when b = 64, s = 10 and u = 40,
>> N*(s+b) > N*u).
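
The comparison above can be tabulated directly. This sketch only evaluates the three expressions from the message (N*u, M, and N*(s+b)) with the example values given there; it assumes "10k" means 10,000, and the function names are illustrative.

```python
def segment_list_bits(n_links, u):
    """N*u: N compressed SIDs of u bits each (the draft-chen side)."""
    return n_links * u


def bier_te_sets_bits(n_links, s, b):
    """N*(s+b): upper bound when each link sits in its own bit set."""
    return n_links * (s + b)


M = 10_000           # assumption: "10k" read as 10,000 BitPositions
u, s, b = 40, 10, 64

# M > N*u holds while N is below this value (10k/40).
break_even_links = M // u

for n in (13, 100, 249, 250, 400):
    print(n, segment_list_bits(n, u), M, bier_te_sets_bits(n, s, b))
```

The loop makes the crossover visible: with these example values the single-BitString overhead M only exceeds N*u for sub-trees of fewer than 250 links, while N*(s+b) exceeds N*u for every N because s+b = 74 is larger than u = 40.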
>>
>>
>>
>> zzh> Do you mean “M = 10, 000”?
>>
>> [HC2]: This is an example value for M. For a network with 10k links,
>> there are 20k BitPositions. On average, the maximum BitPosition is around
>> 10k.
>>
>>
>>
>> zzh> The overhead consideration should be on a per-packet basis. How much
>> of a sub-tree can you encode in **one** packet with a header of a certain
>> size? Your calculation does not seem to be for that (and I have trouble
>> following it).
>>
>> [HC2]: With overhead M bits in a packet, M/u links can be encoded.
>>
>>
>>
>> Zzh2> I assume you won’t have 10k bits in a packet for encoding the
>> (sub-)tree, right? So the key is how many bits in one packet you can use
>> for encoding the (sub-)tree, and how many leaf/replication nodes you can
>> fit in.
>>
>> [HC3]: This is for the case where the overhead in a BIER-TE packet is 10k
>> bits when one BitString without any set is used. For the same overhead,
>> how many links can be encoded.
>>
>>
>>
>> Zzh> Additionally, the text seems to assume/imply that there is a
>> continuous block of uSIDs. But with 40-bit uSIDs, you can only put two
>> uSIDs in an SRv6 SID? How large would the segment list be if you want to
>> encode a reasonably sized sub-tree?
>>
>> [HC2]: "assume" is used to name a variable, such as "assume" that the
>> size of a compressed SID is u bits. Here u is a variable for the average
>> size of a compressed SID. Some example values for the variables are
>> given. Nothing is implied.
>>
>> There are a few methods for compressing SIDs. The method using uSIDs is
>> one of them. The size of the segment list for a sub-tree is u*N bits,
>> where N is the number of links on the sub-tree.
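
To make the exchange above concrete, the sketch below contrasts the ideal u*N figure with what happens when uSIDs must be packed into whole 128-bit SIDs behind a shared block prefix. The 32-bit block length is an assumption picked so that two 40-bit uSIDs fit per SID, as in Zzh's question; neither the block length nor the helper names come from the thread or the draft.

```python
SID_BITS = 128


def usids_per_container(usid_bits, block_bits):
    """How many uSIDs fit in one 128-bit SID after the block prefix."""
    return (SID_BITS - block_bits) // usid_bits


def packed_segment_list_bits(n_links, usid_bits, block_bits):
    """Segment list size when uSIDs are carried in whole 128-bit SIDs."""
    per_container = usids_per_container(usid_bits, block_bits)
    if per_container == 0:
        raise ValueError("uSID does not fit next to the block prefix")
    containers = -(-n_links // per_container)   # ceiling division
    return containers * SID_BITS


USID_BITS = 40    # example value from the thread
BLOCK_BITS = 32   # hypothetical block prefix length

for n_links in (13, 48):
    ideal = n_links * USID_BITS
    packed = packed_segment_list_bits(n_links, USID_BITS, BLOCK_BITS)
    print(n_links, ideal, packed)
```

Under these assumed lengths each link costs 64 bits on the wire rather than 40, so the average size u would have to account for the block prefix and padding, which is the point raised about including the uSID-Block-ID in the average.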
>>
>>
>>
>> Zzh2> How about putting some uSID details in the draft as I mentioned in
>> the other email?
>>
>> [HC3]: We will add some details in the draft.
>>
>>
>> 3. If this did not involve new forwarding scheme (i.e. new hardware) one
>> could argue that multiple solutions could be developed. But given that this
>> does need new hardware and does not do better than alternatives (e.g.
>> BIER-TE), why bother.
>>
>> [HC]: It seems that both SR-P2MP and BIER-TE are involved with new
>> forwarding schemes.
>>
>> Zzh2> SR-P2MP with MPLS does NOT involve new forwarding scheme at all.
>> It’s the same as existing mLDP/RSVP-TE P2MP in the forwarding plane. For
>> SRv6 replication segment, it’s also “like” existing MPLS P2MP in that part
>> of the IPv6 address is used as lookup key just like a label (and that is
>> not different from the SRv6 VPN).
>>
>> [HC3]: It should be a new forwarding scheme for SRv6 replication segment.
>> Under the network programming, replication SID is a new type of SID. It
>> will have a new program/procedure for the behavior of this new SID even
>> though it is “like” existing MPLS P2MP.
>>
>> Zzh2> The scheme in draft-chen is very different: it encodes a sub-tree,
>> and forwarding is certainly more complicated. If it does scale well, then
>> it is worth it even as a new forwarding scheme, but that has not been
>> concluded.
>>
>> [HC3]: Under the network programming, the behavior of the forwarding is a
>> new program/procedure, which uses existing encap with the SIDs of a
>> sub-tree. It seems not that complicated.
>>
>> Zzh2> BIER-TE is a new forwarding scheme, but it is by far the best
>> per-tunnel TE solution for multicast that does not have per-tunnel state
>> inside the network, and it went through due scrutiny. Draft-chen needs to
>> be scrutinized as well, and it needs to show that it does better to proceed
>> as yet another new forwarding scheme.
>>
>> [HC3]: Whether it is the best depends on a number of factors, some of
>> which are explained in previous responses.
>>
>> Zzh2> Jeffrey
>>
>> Zzh> Because of new forwarding schemes, BIER went through the process of
>> forming a new WG, and BIER-TE was originally on the experimental track.
>> More importantly, my earlier point is that the new scheme should scale
>> better (at least as well) for it to be worthy.
>>
>> [HC2]: The new forwarding scheme (or say forwarding behavior) in our draft
>> can be defined in a small program/procedure in the network programming.
>> This is very different from that in BIER, where a new WG was formed.
>>
>> We will update the draft to address your concerns and make it clearer.
>>
>> Your earlier point was addressed in my previous response.
>>
>>
>>
>> Zzh> Jeffrey
>>
>>
>>
>> Juniper Business Use Only
>> _______________________________________________
>> pim mailing list
>> pim@ietf.org
>> https://www.ietf.org/mailman/listinfo/pim
>>
> --
>
> <http://www.verizon.com/>
>
> *Gyan Mishra*
>
> *Network Solutions Architect*
>
>
>
> *M 301 502-1347*
> *13101 Columbia Pike*
> Silver Spring, MD
>
> --

<http://www.verizon.com/>

*Gyan Mishra*

*Network Solutions Architect*



*M 301 502-1347*
*13101 Columbia Pike*
Silver Spring, MD