Re: [mpls] Concerns about ISD

Tianran Zhou <zhoutianran@huawei.com> Sat, 16 April 2022 07:58 UTC

From: Tianran Zhou <zhoutianran@huawei.com>
To: Tony Li <tony.li@tony.li>, Haoyu Song <haoyu.song@futurewei.com>
CC: John E Drake <jdrake@juniper.net>, "mpls@ietf.org" <mpls@ietf.org>
Date: Sat, 16 Apr 2022 07:58:17 +0000
Message-ID: <98ea63e09efc4fb9a7e7351de58d8f6a@huawei.com>
References: <6cc272447d2f4c779e85d5c42d3b3c6c@huawei.com> <8623637D-A32E-47A4-B5FC-4D2CF40BEDD1@tony.li> <6199e0e886f9437c95ef9b70719b00ec@huawei.com> <BCFD3F4A-36D6-47C2-B907-FC40B402F97C@tony.li> <3fb1f261ddff48deb0c2ea083cdbd16f@huawei.com> <6B96F21B-9331-4FA8-AD7B-84A4CA8B6FAB@tony.li> <903c57a48280454091495673ec2fe275@huawei.com> <BD5C1BE7-4633-4B51-BAC1-B2AE1C537F36@tony.li> <ad6b8c42b0aa4880b9dee02516f5e46f@huawei.com> <F5BB2CEB-CC8C-4E71-A2E7-B4212878C3B1@tony.li> <aa9c4b913d844410b2af90c8db78c194@huawei.com> <BY3PR05MB8081937B52E657713E8293BFC7ED9@BY3PR05MB8081.namprd05.prod.outlook.com> <a29c96be774845e582a66700d2264f7b@huawei.com> <BY3PR05MB8081870EF67C551727BBE2CFC7EC9@BY3PR05MB8081.namprd05.prod.outlook.com> <d5521b3972dd43e38276afbbdc7c2bda@huawei.com> <BY3PR05MB80813C7CAD7F2C12C36FB513C7EE9@BY3PR05MB8081.namprd05.prod.outlook.com> <BY3PR13MB47879EB8A582437DE936688C9AEE9@BY3PR13MB4787.namprd13.prod.outlook.com> <C493D0B8-4B57-4D19-BC27-70ABD7F50356@tony.li> <BY3PR13MB47878B227A37AAA06625194B9AEE9@BY3PR13MB4787.namprd13.prod.outlook.com> <0318B3A3-2884-4FD6-B5EF-377481D2657B@tony.li> <BY3PR13MB4787752FB6D147281A7150789AEE9@BY3PR13MB4787.namprd13.prod.outlook.com> <602D6128-3BE3-4A2D-B5C2-019AE0FADF09@tony.li> <BY3PR13MB47876188B5927A51BD4F4E739AEE9@BY3PR13MB4787.namprd13.prod.outlook.com> <BCB99042-ECA3-40C6-8581-FA1656DDF987@tony.li> <BY3PR13MB4787468DAA96610B9933E1659AF19@BY3PR13MB4787.namprd13.prod.outlook.com> <EB04096F-70B7-4FF0-973F-6C7C1FDDE837@tony.li>
In-Reply-To: <EB04096F-70B7-4FF0-973F-6C7C1FDDE837@tony.li>
Archived-At: <https://mailarchive.ietf.org/arch/msg/mpls/gEqgdACJerhDKSPEdjkL_FmWBu0>
Subject: Re: [mpls] Concerns about ISD

It would be better to put all the solutions on the table and compare them using the same use cases, scenarios, and criteria.
Then we can see what ISD can achieve, and how ISD could be better than PSD.

Tianran

From: Tony Li [mailto:tony1athome@gmail.com] On Behalf Of Tony Li
Sent: Saturday, April 16, 2022 9:22 AM
To: Haoyu Song <haoyu.song@futurewei.com>
Cc: John E Drake <jdrake@juniper.net>; Tianran Zhou <zhoutianran@huawei.com>; mpls@ietf.org
Subject: Re: [mpls] Concerns about ISD


Hi Haoyu,

You’re assuming a TCAM and your only metric here seems to be the storage cost.  Many folks don’t do parsing using a TCAM and the performance cost is in the reads, not in the storage.

[HS2] TCAM is actually the better choice to store states involving bitmaps. Otherwise, parsing bitmap requires extra calculations in addition to storage, making it even less favorable.


Ok, I disagree. I would MUCH rather have normal ALU functionality.



[HS2] We propose to use an explicit FEC to signal that there are no EHs in the packet, which avoids scanning the label stack in such cases. This doesn’t nullify the need to carry extra metadata for some user packets to support some in-network functions.


I will wait to see a concrete complete proposal before commenting.





For the EH encoding efficiency, I appreciate your input. Currently it follows the IPv6 EH type (i.e., using NH + LENGTH to delineate every EH). There could certainly be room for improvement.


Ok.  If I understand your proposal correctly, there are 4 octets of overhead (HEH) at the start. Then, for each network action, there would seem to be at least 3 octets of overhead, plus any associated data, plus alignment.

Let’s suppose that we want to encode NFFRR, entropy, and GISS in one packet.  By my math, the cost is:

EHI: 4 octets
HEH: 4 octets
NFFRR: 4 octets
Entropy: 8 octets
GISS: 8 octets

Total: 28 octets

Do I have that right?

[HS] It’s right if you think 8 octets are needed for Entropy and GISS


If I understand your proposal, and we want >8 bits for both of these fields, then each would require 3 octets of overhead. Storing the EL/GIS value might take 2 or 3 octets.  Alignment would take you to 8.
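
As a rough check of the per-item arithmetic above, here is a minimal Python sketch. The 3 octets of overhead and the 2- or 3-octet value sizes come from the text; the helper name `padded_item_size` and the alignment boundaries tried (4 and 8 octets) are assumptions, since the thread does not say which boundary applies. Both give 8 octets per item here.

```python
def padded_item_size(overhead: int, value: int, align: int) -> int:
    """Round overhead + value up to the next multiple of `align` octets."""
    raw = overhead + value
    return -(-raw // align) * align  # ceiling division, then scale back up


# 3 octets of per-item overhead plus a 2- or 3-octet EL/GISS value.
for value_octets in (2, 3):
    for align in (4, 8):  # assumed alignment boundaries
        print(value_octets, align, padded_item_size(3, value_octets, align))
```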

If I look at the same thing with the FAI draft, I get:

FAI: 4 octets
NFFRR: included in the above, so 0 octets
Entropy: 4 octets (30 bits, plus overhead)
GISS: 4 octets (30 bits, plus overhead)

Total: 12 octets
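
The two tallies can be reproduced with a few lines of Python. The octet counts are exactly those quoted in this message; the dictionary names are only illustrative.

```python
# Octet counts as quoted in the thread for carrying NFFRR, entropy, and GISS.
eh_costs = {"EHI": 4, "HEH": 4, "NFFRR": 4, "Entropy": 8, "GISS": 8}
fai_costs = {"FAI": 4, "NFFRR": 0, "Entropy": 4, "GISS": 4}

eh_total = sum(eh_costs.values())
fai_total = sum(fai_costs.values())

print(f"EH total:  {eh_total} octets")
print(f"FAI total: {fai_total} octets")
```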

[HS2] Header overhead is an important dimension for comparison.  To make it apples to apples, let’s first clarify several points.

  1.  If an action doesn’t need any extra data, there’s no point in allocating an EH for it. A flag in EHI is sufficient (this applies to the NFFRR case).


Ok, that saves you 4 octets, but costs you a bit in EHI. There’s only a finite number of bits available.  What do you do when they’re consumed?




  2.  You are using the bitmap encoding since you mention FAI. Bitmap is more compact than Type/Length for sure. Here the implication is that each ISD must be 4 octets long, otherwise it will need every node to understand the size of every data item, further complicating the design.


In FAI, the length of the ISD is defined based on the action and may have different lengths depending on bit combinations. Please see the EG bits, for example.

Yes, determining the length requires reading the full ISD.




  2.  For extensibility, how many ISD data items are planned to be supported? If the number exceeds the capacity of one label, you will need to extend it. So the actual overhead of FAI could be 8 octets instead of 4.


Absolutely true. Allocate popular functions first. :) If there are too many actions, then we could also consider a second SPL.




Let’s temporarily put all these details aside and assume the minimum sized FAI. Basically, EH has a fixed 4 byte overhead for HEH. For each use case that does need metadata, EH will add 4 more byte overhead. This is the header overhead comparison. (In your example, it’s 24 bytes PSD vs. 12 bytes ISD)



So EH is twice as expensive as FAI.
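
To see that the factor of two is not specific to this one example, here is a small sketch under the simplifying assumptions used above: a minimum-sized FAI, 4 octets per data-carrying action in FAI, 8 octets per data-carrying action as an EH, and no extra cost for flag-only actions such as NFFRR. The function names are illustrative.

```python
def fai_overhead(n_data_actions: int) -> int:
    # 4 octets for the FAI itself plus 4 octets of ISD per data-carrying action.
    return 4 + 4 * n_data_actions


def eh_overhead(n_data_actions: int) -> int:
    # 4 octets EHI + 4 octets HEH, plus 8 octets per data-carrying EH.
    return 4 + 4 + 8 * n_data_actions


for n in range(1, 4):
    print(n, fai_overhead(n), eh_overhead(n), eh_overhead(n) / fai_overhead(n))
```

With two data-carrying actions (entropy and GISS) this gives 12 and 24 octets, matching the figures above.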



Now let’s look at the other dimensions.

  1.  We know some use cases require data too big to fit in the stack, so PSD is needed anyway. PSD can also be used for the use cases with smaller metadata size, although it’s relatively inefficient as discussed above. So essentially we can have just one mechanism for everything, but to support ISD we end up with two.


That’s true.  We’re trading added complexity for better performance.




  2.  Bitmap, albeit succinct, is inflexible and less extensible. The semantics and order are fixed at design time; the total size of each data item must be equal to a label size to make the bitmap useful in finding the corresponding data item.


That’s incorrect. The size can vary. An implementation must support the NAI that are set.  Handling cases where a node only knows some of them still needs to be discussed, but I favor just not supporting that case at all.




  1.  Use cases identified in the future can only use unused bitmap bits, regardless of their importance. If a use case needs to extend its data size, it’s out of luck; if some common use cases also apply to other types of networks (especially those still under investigation), the use cases can be seriously limited by such a size constraint, which hampers design sharing and interoperability.


Yes, future actions can only allocate unused bitmap bits.  Reusing existing bitmap bits would overlap with existing functions.  Your statement seems to be a non-sequitur.

A new action may define whatever length of ISD it likes, including being variable length. It’s not recommended, but it is possible.




  2.  Parsing based on bitmap significantly bloats either the parser size or the parsing latency or both (as shown in my analysis). People can write pseudocode themselves to verify. This is bad for both software and hardware data plane implementations. No wonder bitmap has never been used for header encoding before (it’s sometimes used in a header for its sub-structure, which is related to the header processing but not the header parsing).


Only if you insist on using a TCAM for parsing. From my perspective, that’s exactly the wrong tool for the job. If you use a conventional ALU, then parsing becomes a straightforward ‘if’ chain.

We are not obligated to optimize the solution for the wrong tools.
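
For illustration, such an ‘if’ chain might look like the sketch below. The flag names, bit positions, and ISD lengths are hypothetical and are not taken from the FAI draft; the point is only that each set bit advances a cursor by the length its action defines.

```python
# Hypothetical flag bits; real assignments would come from the draft.
FLAG_NFFRR = 0x1    # assumed to carry no ISD
FLAG_ENTROPY = 0x2  # assumed to carry one 4-octet word of ISD
FLAG_GISS = 0x4     # assumed to carry one 4-octet word of ISD


def parse_isd(flags: int, isd_words: list[int]) -> dict[str, object]:
    """Walk the flags in a fixed order, consuming ISD words as needed."""
    actions: dict[str, object] = {}
    cursor = 0
    if flags & FLAG_NFFRR:
        actions["nffrr"] = True
    if flags & FLAG_ENTROPY:
        actions["entropy"] = isd_words[cursor]
        cursor += 1
    if flags & FLAG_GISS:
        actions["giss"] = isd_words[cursor]
        cursor += 1
    actions["isd_words_consumed"] = cursor
    return actions


print(parse_isd(FLAG_NFFRR | FLAG_GISS, [0xABCDE]))
```

A variable-length action would simply advance the cursor by a computed amount in its own branch.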



  2.  The encoding of each ISD item itself is awkward. You have 30 bits and only 30 bits at your disposal with a BoS bit in between. It creates a hole in the data, which is bad for both software and hardware.


We already discussed this. Yes, it’s not optimal, but it’s only a few instructions to fix, if necessary.
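
As a sketch of what those few instructions could be, the Python below closes and reopens the hole left by the bottom-of-stack bit. It assumes the S bit sits where RFC 3032 puts it in a label stack entry (bit 8 counting from the least significant bit). The text above speaks of 30 usable bits; where the remaining overhead bit sits is not stated here, so this sketch handles only the S bit and returns the 31 bits around it. The function names are illustrative.

```python
S_BIT_SHIFT = 8  # S bit position in an RFC 3032 label stack entry


def strip_bos(word: int) -> int:
    """Drop the S bit and join the 23 bits above it to the 8 bits below it."""
    high = word >> (S_BIT_SHIFT + 1)  # bits 31..9
    low = word & 0xFF                 # bits 7..0
    return (high << S_BIT_SHIFT) | low


def insert_bos(value: int, bos: int) -> int:
    """Inverse: reopen the hole at bit 8 and place the S bit there."""
    high = value >> S_BIT_SHIFT
    low = value & 0xFF
    return (high << (S_BIT_SHIFT + 1)) | (bos << S_BIT_SHIFT) | low


word = 0x12345678
bos = (word >> S_BIT_SHIFT) & 1
assert insert_bos(strip_bos(word), bos) == word  # round trip is lossless
```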

(BTW, I wonder why we want to redefine entropy. We already have a standard for it and we don’t need to do anything more about it.)


It’s true that we have a standard for it. It’s fairly common, so then the interesting question is what happens when it is present in the label stack along with MNA?  If we do not incorporate EL into MNA, then an implementation has to deal with both MNA and EL independently.  What order are they in? How do they interact? If we use the standard ELI/EL encoding, that’s 8 octets just for entropy.  OTOH, if we incorporate EL into MNA, then when using both, we can collapse the encoding.  For example, in the example above, entropy is 4 octets, resulting in a savings of 4 octets.  Half price! And that’s assuming that you don’t use the smaller entropy/GISS encodings which would give an even greater savings.

[HS2] I think the new design won’t nullify the EL/ELI standard so they need to coexist in an incremental deployment scenario given EL/ELI might have been realized (It’s possible to provide yet another entropy solution using the new mechanism but the necessity is subject to further discussion IMO). But they don’t need to interact with each other because each has its clear function and operational procedure.



Well, you could do things that way, but IMHO, that’s suboptimal.  If you’re including MNA, then you’re already limiting yourself to MNA capable nodes. You might as well take the benefit of the better encoding.

Tony