[spring] draft-ietf-spring-segment-protection-sr-te-paths

bruno.decraene@orange.com Fri, 18 November 2022 14:12 UTC

From: bruno.decraene@orange.com
To: SPRING WG <spring@ietf.org>, "shraddha@juniper.net" <shraddha@juniper.net>
Thread-Topic: draft-ietf-spring-segment-protection-sr-te-paths
Date: Fri, 18 Nov 2022 14:12:02 +0000
Archived-At: <https://mailarchive.ietf.org/arch/msg/spring/f2953-liLbERSgW03goCvHn9gac>
Subject: [spring] draft-ietf-spring-segment-protection-sr-te-paths

[speaking as individual contributor]

Hi Shraddha, all,

Please find below comments on section 5 (IGP hold timer for protecting traffic to failed node).

Following my previous comments on -02, thank you for the updated text in -03.

The current solution is based on a modification of the SPF computation (or on FIB installation not being in sync with the regular SPF). As such, it relies on all compliant routers having strictly the same behavior. Otherwise, this would translate into micro-loops for the duration of the hold timer (easily in the range of 10s or higher, IMHO).


  1.  In order to achieve this consistent behavior on all nodes, I would welcome a more normative description of the behavior, using RFC 2119 key words.
Also, since the current algorithm relies on the SPF back-off delay for all nodes to compute their "hold time" SPF on the same LSDB (same set of events), I think that this calls for the use of a standardized back-off delay, hence a normative reference to RFC 8405.


  2.

"A small configurable
   delay called spf-delay can be enabled, which will schedule the SPF
   after the spf-delay time on receiving the first event.  In case of a
   node going down, the spf-delay time coupled with fast-flooding can
   help to accumulate link-down events reported by all neighbors in one
   single SPF.  This mechanism is on best effort basis and does not
   guarantee that all link-down events are accumulated before SPF is
   triggered.  If there are flooding delays, the SPF might get triggered
   before receiving all events related to node going down."

A few comments:
- Indeed, the above relies on (requires) fast flooding. I would suggest adding an informative reference to an extension improving the current flooding speed (draft-ietf-lsr-isis-fast-flooding).
- I don't think that the proposed trick is good enough, or that it really works. Typically, spf-delay is configured with a short initial timer in order to react quickly to link failures, which are the most common. This would not allow accumulating multiple events, hence different nodes could use a different event/LSP before your proposed FIB hold-down, and this would translate into loops for the duration of the "hold down".
- Even if the above worked, I agree that this would be only "best effort". I don't think that this is good enough given the modest gain in good cases (improved availability on the few SR-TE paths using the node going down) and the cost in bad cases (micro-loops for a relatively long duration (e.g. 10s), which can overload multiple links and reduce the availability of all traffic, including the traffic prioritized by QoS).
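
To make the second comment concrete, here is a minimal sketch under my own assumptions; none of it comes from the draft. It models a single SPF that runs spf_delay after the first event is received (no back-off states), and the arrival times, router names and event names are hypothetical. It only shows how two routers can run their first SPF on different sets of link-down events.

```python
def events_in_first_spf(arrival_ms, spf_delay_ms):
    """Return the set of link-down events seen by the first SPF run.

    arrival_ms maps an event name to the time (ms) at which this router
    received it. The first SPF is assumed to run spf_delay_ms after the
    first event arrives (simplified model, no back-off states).
    """
    first_event = min(arrival_ms.values())
    spf_time = first_event + spf_delay_ms
    return {ev for ev, t in arrival_ms.items() if t <= spf_time}


# Hypothetical flooding arrival times (ms) for the failure of node N,
# as reported by its neighbors X, Y and Z.
r1 = {"X-N down": 0, "Y-N down": 30, "Z-N down": 80}
r2 = {"Y-N down": 5, "X-N down": 70, "Z-N down": 90}

for delay in (50, 200):
    view1 = events_in_first_spf(r1, delay)
    view2 = events_in_first_spf(r2, delay)
    print(delay, "consistent" if view1 == view2 else "inconsistent")
```

With these made-up numbers, a 50 ms delay leaves the two routers freezing their FIB on different topologies, while 200 ms happens to be long enough. Nothing guarantees the latter in a real network, which is the "best effort" concern.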


  3.  Given the two points above, I would rather propose a different solution: instead of using a hold-timer, push a new node SID toward a neighbor of the failed node, typically the closest one (à la near side tunneling).
Cf. draft-hegde-rtgwg-microloop-avoidance-using-spring, which you wrote and which says in the abstract: "Micro-looping is generally more harmful than simply dropping traffic on failed links, because it can cause control traffic to be dropped on an otherwise healthy link involved in micro-loop. This can lead to cascading adjacency failures or network meltdown."
I think that this text is also relevant in the context of draft-ietf-spring-segment-protection-sr-te-paths.

Note that instead of using the closest neighbor, one could also use the neighbor advertising the mirror SID for the failed node (as defined in RFC 8667 and 8402). The use of tunneling/pushing a SID allows for this choice to be local and hence implementation dependent.
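
As a rough sketch of the local choice described above (an illustration under my own assumptions, not a specification): the function names, the SID values and the IGP distances below are hypothetical, and what the selected neighbor then does with the rest of the SID list is deliberately left out.

```python
def pick_neighbor(failed_neighbors, distance, mirror_advertisers=frozenset(),
                  prefer_mirror=False):
    """Pick a neighbor of the failed node to tunnel to (purely local choice).

    distance maps a still-reachable node to its IGP distance from this router.
    By default the closest neighbor wins; with prefer_mirror=True, a neighbor
    advertising a mirror SID for the failed node is preferred when one exists.
    """
    candidates = [n for n in failed_neighbors if n in distance]
    if prefer_mirror:
        mirrors = [n for n in candidates if n in mirror_advertisers]
        if mirrors:
            candidates = mirrors
    if not candidates:
        return None
    # Tie-break on the name so that the choice is deterministic.
    return min(candidates, key=lambda n: (distance[n], n))


def push_tunnel_sid(sid_stack, node_sid, neighbor):
    """Return a new SID list with the neighbor's node SID pushed on top."""
    return [node_sid[neighbor]] + list(sid_stack)


# Hypothetical values: G failed, F and F2 are its neighbors.
distance = {"F": 10, "F2": 30}
node_sid = {"F": 16006, "F2": 16016}
original_stack = [16007, 16008]  # SID of failed G on top, then the rest

closest = pick_neighbor(["F", "F2"], distance)
print(push_tunnel_sid(original_stack, node_sid, closest))
```

Because the decision is taken by the node pushing the SID, two routers making different choices (closest neighbor versus mirror SID advertiser) still forward consistently: every other node simply follows the pushed node SID along the regular shortest path.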

Thank you,
Best regards,
--Bruno

> Bruno,
>
> Snipped...
>
>   2.  draft-ietf-spring-node-protection-for-sr-te-paths
>
> "If the Node-SID or Prefix-SID becomes
> unreachable, the event and resulting forwarding changes should not
> communicated to the forwarding planes on all configured routers
> (including PLRs for the failed node) until the hold-timer expires."
>
>
>   *   It's not crystal clear to me how it would work in reality, so I would welcome more prescriptive text. In particular:
>      *   "node failure" is not an IGP message. IGP nodes see multiple "adjacency loss" messages, which are not atomic and could be handled in multiple SPFs. Hence different nodes will freeze their FIB based on a different topology (link1 for some, link2 for others), leading to inconsistent routing and forwarding loops.
> <SH> I will add text to capture this point and also add detailed text on possible solutions.
>
>
>      *   How is the FIB modified in case of consecutive IGP events? (A FIB frozen on the hold topology may lead to drops; updating entries would need to be specified.)
> <SH> Agreed.
>
>   *   On a side note, this text requires a global behavior of all IGP nodes. That seems a bit out of scope for a non-normative sentence, in an informational document, describing a local behavior on the PLR.
> <SH> I don't think we need any protocol extension for the solutions described in this document, but if the WG thinks it should be a standard rather than informational,
> we should upgrade this document's status, IMO.
>
> Rgds
> Shraddha
>
>
>
> From: spring <spring-bounces@ietf.org> On Behalf Of bruno.decraene@orange.com
> Sent: Wednesday, February 2, 2022 6:56 PM
> To: slitkows.ietf@gmail.com; 'SPRING WG' <spring@ietf.org>; Huzhibo <huzhibo@huawei.com>
> Subject: Re: [spring] WG adoption call - draft-hu-spring-segment-routing-proxy-forwarding
>
> Hi authors of both documents, WG,
>
> [Speaking as individual contributor.]
>
> It's good to see technical discussions on the restoration of failed SIDs used by SR policy.
>
>
>   1.  From a functional point of view, can we summarize the benefits of signaling the node proxy capability?
> e.g.
> - drop the traffic earlier if the PLR does not support proxy capability. (helps with congestion)
> - use another proxy off the shortest path (increase congestion but reduce loss)
> - possibly help identifying the proxy (nominal is not in the reachable topology anymore)
> ...
> Or agree on the absence of significant benefits?
>
>
>   2.  draft-ietf-spring-node-protection-for-sr-te-paths
>
> "If the Node-SID or Prefix-SID becomes
> unreachable, the event and resulting forwarding changes should not
> communicated to the forwarding planes on all configured routers
> (including PLRs for the failed node) until the hold-timer expires."
>
>
>   *   It's not crystal clear to me how it would work in reality, so I would welcome more prescriptive text. In particular:
>
>      *   "node failure" is not an IGP message. IGP nodes see multiple "adjacency loss" messages, which are not atomic and could be handled in multiple SPFs. Hence different nodes will freeze their FIB based on a different topology (link1 for some, link2 for others), leading to inconsistent routing and forwarding loops.
>      *   How is the FIB modified in case of consecutive IGP events? (A FIB frozen on the hold topology may lead to drops; updating entries would need to be specified.)
>
>   *   On a side note, this text requires a global behavior of all IGP nodes. That seems a bit out of scope for a non-normative sentence, in an informational document, describing a local behavior on the PLR.
>
>
>
>
>   3.  draft-hu-spring-segment-routing-proxy-forwarding
> Rather than defining a new "Proxy Forwarding" capability in IGP, why don't you use the existing Mirroring Segment (from RFC 8402: https://datatracker.ietf.org/doc/html/rfc8402#section-5.1), whose signaling is already standardized? https://datatracker.ietf.org/doc/html/rfc8667#section-2.4.1
>
>
>   4.  What about the following solution:
>
>   *   Use mirror SID
>   *   Tunnel to the "proxy-forwarding" advertising mirror SID
>
> I would see the following benefits:
>
>   *   No new protocol extensions (cf "3)")
>   *   Consistent routing in case of multiple SPFs (cf "2)")
>   *   Benefit from the signaling of the proxy (cf "1)")
>
> Thanks,
> Regards,
> --Bruno
>
>
>
> From: slitkows.ietf@gmail.com
> Sent: Tuesday, January 25, 2022 6:13 PM
> To: DECRAENE Bruno INNOV/NET <bruno.decraene@orange.com>; 'SPRING WG' <spring@ietf.org>
> Subject: RE: [spring] WG adoption call - draft-hu-spring-segment-routing-proxy-forwarding
>
> Hi,
>
> I'm NOT supporting this draft for the following reasons:
>
>
>   1.  The WG already has a WG document dealing with this problem. I don't think that the WG should come up with multiple documents/solutions for the same solution space, as it may just confuse the industry and create deployment issues, since different vendors may pick different solutions.
>
>
>
>   2.  Adding protocol extensions adds complexity to the solution without adding strong value.
>
>
>
> The document claims that "[I-D.ietf-spring-segment-protection-sr-te-paths] ... may not work for some cases such as some of nodes in the network not supporting this solution.". While this is true, the proposed solution in draft-hu-spring-segment-routing-proxy-forwarding has exactly the same caveat and requires all nodes in the network to support the solution.
>
>
>
> Consider the following straight-line network: A - B - C - D - E - F - G - H, and an SR policy from A to H using SID_G. Routers A to F have to support the extension for the solution to work; if one of these routers doesn't support the extension, traffic will be dropped.
>
>
>
> So there is no added value compared to the timer-based solution of [I-D.ietf-spring-segment-protection-sr-te-paths].
>
>
>
> The authors of draft-hu-spring-segment-routing-proxy-forwarding argued that G may have multiple upstream neighbors, let's say F and F', and that the solution allows F' to support the extension while F may not support it, so the solution will send the traffic to F'. Well, yes, but this still requires all routers upstream of F' to support this extension, and maybe F is on the path to F'. So I don't think the argument is valid: it may work tactically, depending on the network topology, when we look at a small portion of the network, but when we look at the whole network, operators will have to upgrade all their nodes to support the extension to ensure the benefit is there.
>
>
>
> In addition, in terms of traffic, forwarding traffic to a neighbor of the failed node which wasn't initially on the path could lead to traffic congestion or high traffic peaks on links that were not sized to carry this traffic. We could easily expect some traffic tromboning, where traffic goes to this non-natural neighbor of the failed node and then goes back over some part of the same path before reaching the destination.
>
>
>
> So these protocol extensions are bringing complexity for no value here.
>
>
>
>
>   3.  Regarding BSID, I'm not a fan of advertising BSIDs in the IGP, as there may be hundreds or thousands of BSIDs on a node, which again will create a lot of burden in the IGP. The proposed approach will have to be discussed in LSR, not in SPRING (see next comment).
>
>
> Note that [I-D.ietf-spring-segment-protection-sr-te-paths] could also work with BSIDs as long as the BSID information of the failed node is available in the control plane of PLRs by whatever mechanism. I think this BSID handling is orthogonal to the proxy-forwarding control-plane behavior. The forwarding operations for BSID will have to be discussed in more detail; we cannot expect all HW to be able to do 3 or 4 lookups without any performance degradation.
>
>
>
>   4.  The document is currently a bit borderline between SPRING and LSR, as it discusses IGP protocol extensions in considerable detail. If it's a SPRING document, it should detail requirements for protocols but nothing beyond that.
>
>
>
> Brgds,
>
> Stephane
>
>
> From: spring <spring-bounces@ietf.org> On Behalf Of bruno.decraene@orange.com
> Sent: Thursday, January 13, 2022 11:19
> To: SPRING WG <spring@ietf.org>
> Subject: [spring] WG adoption call - draft-hu-spring-segment-routing-proxy-forwarding
>
> Dear WG,
>
> This message starts a 2 week WG adoption call, ending 27/01/2022, for draft-hu-spring-segment-routing-proxy-forwarding
> https://datatracker.ietf.org/doc/draft-hu-spring-segment-routing-proxy-forwarding/
>
> After review of the document please indicate support (or not) for WG adoption of the document to the mailing list.
>
> Please also provide comments/reasons for your support (or lack thereof) as this is a stronger way to indicate your (non) support as this is not a vote.
>
> If you are willing to work on or review the document, please state this explicitly. This gives the chairs an indication of the energy level of people in the working group willing to work on the document.
>
> Thanks!
> Bruno, Jim, Joel
