[spring] Re: draft-ietf-spring-resource-aware-segments-17 ietf last call Opsdir review

"Dongjie (Jimmy)" <jie.dong@huawei.com> Fri, 15 May 2026 03:15 UTC

From: "Dongjie (Jimmy)" <jie.dong@huawei.com>
To: Fung Lim <flim@cisco.com>, "ops-dir@ietf.org" <ops-dir@ietf.org>
Thread-Topic: draft-ietf-spring-resource-aware-segments-17 ietf last call Opsdir review
Thread-Index: AQHc4DXOohhZ8fSBbUSR6mla6MfRDLYIbXHggAX4l8A=
Date: Fri, 15 May 2026 03:14:52 +0000
Message-ID: <d1d1708d1a584825a73add5e29a66b3c@huawei.com>
References: <177838744379.414360.11957866318395480847@dt-datatracker-54557f87b8-lnrkh> <1167fee1faee4db7ba68a40b94d8ed92@huawei.com>
In-Reply-To: <1167fee1faee4db7ba68a40b94d8ed92@huawei.com>
CC: "draft-ietf-spring-resource-aware-segments.all@ietf.org" <draft-ietf-spring-resource-aware-segments.all@ietf.org>, "last-call@ietf.org" <last-call@ietf.org>, "spring@ietf.org" <spring@ietf.org>
Subject: [spring] Re: draft-ietf-spring-resource-aware-segments-17 ietf last call Opsdir review
List-Id: "Source Packet Routing in NetworkinG (SPRING)" <spring.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/spring/x9HV0IGLQC2HweS6f56HchnyYh0>

Hi Fung,

Thanks again for your review of this draft. We've posted a new revision that incorporates all your comments and suggestions on the security considerations: https://datatracker.ietf.org/doc/html/draft-ietf-spring-resource-aware-segments-18

Please also find replies to some of the comments inline, and let us know if these comments have been addressed. 

Best regards,
Jie (on behalf of coauthors)

-----Original Message-----
From: Fung Lim via Datatracker <noreply@ietf.org> 
Sent: Sunday, May 10, 2026 12:31 PM
To: ops-dir@ietf.org
Cc: draft-ietf-spring-resource-aware-segments.all@ietf.org; last-call@ietf.org; spring@ietf.org
Subject: draft-ietf-spring-resource-aware-segments-17 ietf last call Opsdir review

Document: draft-ietf-spring-resource-aware-segments
Title: Introducing Resource Awareness to SR Segments
Reviewer: Fung Lim
Review result: Has Issues


1. Operational Considerations section needs expansion

No guidance on resource group sizing or planning. The document acknowledges scalability concerns (Section 2.1: "there can be scalability concerns when the number of resource groups is large") but the Operational Considerations section provides no guidance on practical limits or planning heuristics. Providing such guidance would better align the document with its intended use cases.

[Jie] Some guidance on resource group sizing and planning has been added to the operational considerations section.


The document should describe what happens when resource allocation fails or is inconsistent. Section 3 states resource group support "MUST be aligned among the network nodes," but what happens operationally when it is not? How does an operator detect misalignment? What are the failure symptoms, or failure modes?

[Jie] Some text about resource allocation failure has been added to the control plane considerations, and text about inconsistency in the binding has been added to the operational considerations section.


There is also no discussion of resource over-subscription. What happens when traffic exceeds the allocated resources for a resource-aware SID? Is traffic dropped, downgraded, or does it spill into other resource groups? This is a critical operational question left entirely unaddressed and guidance is necessary for consistent implementation.

[Jie] The behavior for over-subscription on a "virtual topology" built with a resource group is no different from over-subscription in existing networks. That said, some text about the operator's policy on the treatment of traffic exceeding the allocated resources has been added.


2. Missing configuration management guidance

The document describes two provisioning approaches (local configuration vs. centralized controller) but provides no guidance on:

- What configuration parameters exist and what are their defaults?

[Jie] Some text has been added to the control plane considerations section about resource group and resource-aware SID provisioning.

- Are there any considerations for validation before activation, or after device reboots?
- What state must be preserved across device reboots?
- How to handle configuration rollback if resource allocation partially fails across a multi-node resource group?

[Jie] IMO no special consideration is needed for activation or device reboot. Text on rollback in case of partial resource allocation failure has been added to the control plane considerations.



3. No fault management or troubleshooting guidance

The document introduces several new failure modes but does not discuss how operators would detect or diagnose them:

- A node fails to allocate resources to a resource-aware SID
- Resource group alignment becomes inconsistent across nodes
- Resource exhaustion on a subset of links in a resource group
- SLA violations caused by insufficient resource allocation

[Jie] As mentioned above, text related to resource allocation failure and inconsistent binding has been added to this version.
Resource exhaustion and SLA violation are discussed in the security considerations.


Nits

4. Section 3 mentions that "in network cases with SR and other TE mechanisms (such as RSVP-TE) co-existing," IGP advertisements "may need to be updated" and "it is suggested such updates would be rate-limited." The text lacks specifics: what rate limiting is appropriate? What are the consequences of not rate-limiting?

[Jie] Some references to control plane rate-limiting and suppression mechanisms have been added.


5. No discussion of management interoperability

The document references NETCONF/YANG and draft-ietf-spring-sr-policy-yang as controller interfaces, but does not discuss:

- Whether a YANG Data Model for resource-aware segments is needed
- How resource group state would be exposed through management interfaces
- Whether existing SR YANG models are sufficient or need extension

[Jie] Some more text about YANG model augmentation and a reference to a related YANG model have been added.


6. PHP recommendation needs operational justification

Section 2.1 states: "it is RECOMMENDED that Penultimate Hop Popping (PHP) be disabled." Disabling PHP is a significant operational change for many SR-MPLS deployments. The document should discuss:

- The operational impact of disabling PHP (e.g., increased label stack processing on egress)
- How to verify that PHP is correctly disabled across relevant paths
- What happens if PHP is not disabled: is there a graceful degradation or a hard failure?

[Jie] The mechanism and the benefits/costs of PHP are adequately covered in multiple RFCs and are not repeated in this draft. We added some text about the cost of not disabling PHP in the case specified in this draft.


7. Typos in Operational Considerations section

"Althougth" → "Although" (line 578)
"paradigmn" → "paradigm" (line 578)

[Jie] These typos have been fixed, thanks. 


8. Single-vendor implementation status

Section 5 lists only Huawei implementations. While this is noted as per SPRING WG policies, from an operational perspective, single-vendor implementation raises interoperability concerns for a Standards Track document.

[Jie] As the implementation status section says: 
"This is not intended as, and must not be construed to be, a catalog of available implementations or their features. Readers are advised to note that other implementations may exist."
AFAIK there could be implementations that have not been reported to the WG yet.


I hope these comments are useful and constructive! The core mechanism is well-conceived; strengthening the operational guidance will improve its deployability.

[Jie] Yes, these comments are very helpful. Thanks again for the review and suggestions.

Best regards,
Jie


Fung Lim