RE: [EXTERNAL] draft-ietf-rtgwg-segment-routing-ti-lfa : A simple pathological network fragment

"Voyer, Daniel" <daniel.voyer@bell.ca> Tue, 07 November 2023 09:39 UTC

From: "Voyer, Daniel" <daniel.voyer@bell.ca>
To: Ahmed Bashandy <abashandy.ietf@gmail.com>, Yingzhen Qu <yingzhen.ietf@gmail.com>
CC: "rtgwg@ietf.org" <rtgwg@ietf.org>, "draft-ietf-rtgwg-segment-routing-ti-lfa@ietf.org" <draft-ietf-rtgwg-segment-routing-ti-lfa@ietf.org>, rtgwg-chairs <rtgwg-chairs@ietf.org>, Alexander Vainshtein <Alexander.Vainshtein@rbbn.com>
Date: Tue, 07 Nov 2023 09:39:46 +0000
Message-ID: <92259905-8D73-4689-AF6D-623D16CC6ED9@bell.ca>
References: <9908D9F3-45C6-497D-B3BF-84D8A68A5013@gmail.com> <CABNhwV30uhLOo52WHAv6YS4Wg0k9gDbkrs1ANuGPPdLzc1=dsw@mail.gmail.com> <6A2E595E-A7E6-4976-ACC9-E75402AD99E2@gmail.com> <PH0PR03MB63005F751BF04E8D5BEC982FF6A6A@PH0PR03MB6300.namprd03.prod.outlook.com> <A5218ED2-479C-48B5-8AC8-DA6B247D6665@gmail.com> <PH0PR03MB63000BC8F43B90B0CA1A1543F6A6A@PH0PR03MB6300.namprd03.prod.outlook.com> <E02A044F-4431-4559-97A8-C6B810DD7E4D@gmail.com> <AF2B1C41-55F6-4E78-AA4B-0AE7F573820B@gmail.com> <PH0PR03MB63002291B0F1F5514875018EF6A6A@PH0PR03MB6300.namprd03.prod.outlook.com> <2A85DD24-612D-472D-907D-1D90C88A95AD@gmail.com> <CABY-gOMVMb+TWLoKBdurn7jL=xF6APeVhEoSzSbH5CGnvmpLHw@mail.gmail.com> <e9be92cc-fd1c-f5da-be61-74d9dfe793da@gmail.com> <CABY-gOOTzUAUJJ3iTNusVCbJNjoKckViFq8bDe8ZkoOEN32FTw@mail.gmail.com> <bcf5a242-5129-b33a-a602-5cf1f88f1d3b@gmail.com> <CABY-gOOviqm1cDT-oJTgUVz5QGmpEmhsgLW=GJDzRzx+HXZiQw@mail.gmail.com> <1d3524ad-d53b-665d-c2c2-0b7f9eaa59af@gmail.com>
In-Reply-To: <1d3524ad-d53b-665d-c2c2-0b7f9eaa59af@gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtgwg/9BBN38spNVcrqfpocsbRS9mPPDY>

Yes Ahmed, and thanks to Yingzhen for correcting my copy/paste – the date is Nov 8th

From: rtgwg <rtgwg-bounces@ietf.org> on behalf of Ahmed Bashandy <abashandy.ietf@gmail.com>
Date: Tuesday, November 7, 2023 at 10:38 AM
To: Yingzhen Qu <yingzhen.ietf@gmail.com>
Cc: "rtgwg@ietf.org" <rtgwg@ietf.org>, "draft-ietf-rtgwg-segment-routing-ti-lfa@ietf.org" <draft-ietf-rtgwg-segment-routing-ti-lfa@ietf.org>, rtgwg-chairs <rtgwg-chairs@ietf.org>, Alexander Vainshtein <Alexander.Vainshtein@rbbn.com>
Subject: [EXT]Re: [EXTERNAL] draft-ietf-rtgwg-segment-routing-ti-lfa : A simple pathological network fragment


17:00-18:00 Prague time, correct?

Ahmed


On 11/7/23 1:33 AM, Yingzhen Qu wrote:
Wednesday (11/8) 17:00-18:00, Location: Palmovka 1/2
Webex Link:
https://ietf.webex.com/meet/sidemeetingietf2

On Tue, Nov 7, 2023 at 1:30 AM Ahmed Bashandy <abashandy.ietf@gmail.com> wrote:

I looked through the various replies and I was not able to find the time slot or the webex link

But I am assuming it will be shortly after Session I on Wednesday. This way we do not miss Session II, or at least only miss the first few minutes



Ahmed


On 11/7/23 12:42 AM, Yingzhen Qu wrote:
Hi Ahmed,

We'll have a Webex link, so it's about your availability.

Thanks,
Yingzhen

On Tue, Nov 7, 2023 at 12:33 AM Ahmed Bashandy <abashandy.ietf@gmail.com> wrote:

I had to convert my attendance to remote due to family issues late last week. So I am not onsite



Ahmed


On 11/2/23 9:34 AM, Yingzhen Qu wrote:
Hi,

The ti-lfa draft has not done WGLC yet, and we should definitely try to resolve this issue.

I just checked the IETF 118 attendees list, and it seems not everyone will be onsite. I'd suggest continuing the discussion using this thread, and we can schedule either a side meeting during 118 or an Interim meeting on this topic after 118. Authors from both the ti-lfa and sr-uloop, Stewart, Sasha, and Gyan should be there.

Please reply with your thoughts or email the chairs directly.

Thanks,
Yingzhen

On Thu, Nov 2, 2023 at 8:45 AM Stewart Bryant <stewart.bryant@gmail.com> wrote:
Sasha, please see inline


On 2 Nov 2023, at 14:12, Alexander Vainshtein <Alexander.Vainshtein@rbbn.com> wrote:

Stewart and all,
I think I now understand the difference between Section 6.2 of RFC 5715<https://www.rfc-editor.org/rfc/rfc5715#section-6.2> and the SR Micro-Loop Avoidance<https://datatracker.ietf.org/doc/html/draft-bashandy-rtgwg-segment-routing-uloop-15> draft – or, rather, common expectations from this draft.

RFC 5715 was published in 2010. With the tunneling techniques available in the industry at that time, I suspect that “tunnels whose path is not affected by the topology change” in this section were implicitly presumed to be RSVP-TE tunnels – simply because no other tunneling technology was available at that time (I do not think that source routing in IP was seriously considered).

Nearside tunnelling does not need RSVP. Any ingress that will use the PLR to deliver a packet via the failed link will always be able to reach the PLR, since it is on the nearside of the failure. So all you need to do is push a label that is associated with the PLR router, and the packet will get to the PLR, where that label will be popped to reveal the label associated with an entity reachable via the failed link, which then triggers a repair action on that packet. I cannot remember if we wrote that down, but as I remember we considered it obvious at the time. This needs no signalling beyond the normal IGP and MPLS LDP.




The context of the SR Micro-Loop Avoidance draft is Segment Routing, and RSVP-TE is, most probably, out of scope. With SR, the tunnels that are not affected by topology changes are implemented as contiguous lists of unprotected Adj-SIDs – but forwarding HW is quite limited regarding the length of such lists. Therefore, the common expectation from SR Micro-Loop Avoidance (as well as from TI-LFA) is that it uses tunnels whose paths are not affected by the topology change and that are implemented using reasonably short lists of Node SIDs and Adj-SIDs.

For link failure with loop-free support you never need more than two labels: one to get you to the edge of P space and, if the P node is not a PQ node, a second label to get you into Q space.

A router already has the label to reach the P router. Before the work on SR we proposed to use T-LDP, but now you would use an SR label.

So you only ever need one label at ingress and you already have that. You need at most two labels at the PLR, one normal label that you already have and one adjacency label that you could have got from T-LDP but which you more conveniently get from the SR system.

This is much simpler than the TiLFA approach and just works.
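
The two-label argument above can be illustrated with a minimal Python sketch. It is not taken from any of the drafts: the function names (`ring`, `dijkstra`, `repair_segments`), the `node-sid`/`adj-sid` tuples, and the ring topologies are hypothetical. It assumes a connected topology with symmetric link costs and that the failed link S-E lies on the shortest path from S to E. P space and Q space are computed with the usual shortest-path tests relative to the failed link.

```python
import heapq


def ring(names, cost=1):
    """Build a hypothetical ring topology with symmetric link costs."""
    graph = {n: {} for n in names}
    for a, b in zip(names, names[1:] + names[:1]):
        graph[a][b] = cost
        graph[b][a] = cost
    return graph


def dijkstra(graph, src):
    """Shortest-path distances from src in the pre-failure topology."""
    dist = {src: 0}
    heap = [(0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph[u].items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def repair_segments(graph, s, e):
    """Return a repair segment list for failure of link s-e (assumed model).

    P space: nodes that s reaches without any shortest path crossing s-e.
    Q space: nodes that reach e without any shortest path crossing s-e.
    """
    cost = graph[s][e]
    ds, de = dijkstra(graph, s), dijkstra(graph, e)
    p_space = {x for x in graph if ds[x] < cost + de[x]}
    q_space = {x for x in graph if de[x] < ds[x] + cost}

    # One label: a PQ node exists, so its node SID is enough.
    pq = sorted(p_space & q_space, key=lambda x: (ds[x], x))
    if pq:
        return [("node-sid", pq[0])]

    # Two labels: node SID of a P node plus an adjacency SID into Q space.
    for p in sorted(p_space, key=lambda x: (ds[x], x)):
        for q in sorted(graph[p]):
            if q in q_space and (p, q) != (s, e):
                return [("node-sid", p), ("adj-sid", p, q)]
    return None  # no repair found under these assumptions


if __name__ == "__main__":
    print(repair_segments(ring(["S", "A", "B", "C", "E"]), "S", "E"))
    print(repair_segments(ring(["S", "A", "B", "C", "D", "E"]), "S", "E"))
```

With unit costs, the five-node ring has a PQ node (B), so a single node label suffices. In the six-node ring P space and Q space are adjacent but do not overlap, so the sketch returns the node label for B followed by the adjacency label for B-C.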



Section 6 of the TI-LFA draft describes how such paths can be computed in the case of a single link failure, and the constraint of post-convergence is one way to guarantee that they are not affected by topology changes.

SR Micro-Loop Avoidance draft, for the last 7 years, repeats the promise to define approaches for computing such paths in Section 3, which remains unchanged from the -00 version to this day.
I admit that I am not aware of another way to guarantee that a path that is implemented as a sequence of SIDs and includes Node SIDs would not be affected by the topology change.

What do you think?

I think we are making a simple problem much harder than it needs to be.

Best regards

Stewart



Regards,
Sasha






From: Stewart Bryant <stewart.bryant@gmail.com>
Sent: Thursday, November 2, 2023 1:29 PM
To: Alexander Vainshtein <Alexander.Vainshtein@rbbn.com>
Cc: Stewart Bryant <stewart.bryant@gmail.com>; rtgwg@ietf.org; rtgwg-chairs <rtgwg-chairs@ietf.org>; draft-ietf-rtgwg-segment-routing-ti-lfa@ietf.org; Gyan Mishra <hayabusagsm@gmail.com>
Subject: Re: [EXTERNAL] draft-ietf-rtgwg-segment-routing-ti-lfa : A simple pathological network fragment

I just want to correct something

You do not of course need to tunnel if the packets only go through nodes that are shielded from the knowledge that the link has failed and thus do not reconverge. A method such as RFC5715 section 6.7 - Ordered FIB update - does not require a tunnel because it causes the Q space to gradually expand and P space to gradually contract until the PLR is subsumed into Q space.

- Stewart



On 2 Nov 2023, at 11:20, Stewart Bryant <stewart.bryant@gmail.com> wrote:



On 2 Nov 2023, at 08:56, Alexander Vainshtein <Alexander.Vainshtein@rbbn.com> wrote:

Stewart and all,
I have looked up RFC5715 Section 6.2<https://www.rfc-editor.org/rfc/rfc5715#section-6.2> and I agree that it is similar to the SR Micro-Loop Avoidance<https://datatracker.ietf.org/doc/html/draft-bashandy-rtgwg-segment-routing-uloop-15> draft.

Specifically, it explicitly mentions usage of “tunnels whose path is not affected by the topology change”, which is quite close IMHO to what the SR Micro-Loop Avoidance draft is about and quite close to post-convergence paths used in TI-LFA.
At the same time there are some differences. Specifically, the RFC mentions “a new "loop-prevention" routing message” being issued by the router adjacent to the failure. No such message is required in the SR Micro-Loop Avoidance draft.

There are two ways of looking at this - new as in of a new type - and new as in a new message is issued.

Whatever you do, you need a message to trigger loop prevention, otherwise nodes remote from the failure will not know that it is needed. This could be done in one of two ways: either a bespoke fast-flooded message, or you can trigger it from the routing LSP that will be issued by the PLR. It is not clear how fast the LSP flooding message will reach all the nodes in P space.

Either way you need a message to trigger loop prevention.


I also think that the proposal, in the case of a link failure, to tunnel traffic to the nearest node adjacent to failure, is problematic. (Of course, the SR Micro-Loop Avoidance draft does not provide any approach for computing micro-loop avoiding paths with limited depth of added label stacks at all, it just repeats the promise to provide reference approaches starting from version -00 and until now).

There are of course multiple ways of avoiding loops called up in RFC5715, but all of them require that all packets arriving at any ingress in the network that would originally go to the PLR either go direct to Q space or continue in P space to the PLR, where they are repaired. If they are going to the PLR they need to be tunnelled, where for the purposes of this discussion encapsulating in an SR packet is considered to be a tunnel.

Anyway you did not comment on my point that if we need loop free anyway the congruence of the PLR path to the PQ node is no longer a hard requirement.

Best regards

Stewart


My 2c,
Sasha

From: Stewart Bryant <stewart.bryant@gmail.com>
Sent: Thursday, November 2, 2023 10:15 AM
To: Alexander Vainshtein <Alexander.Vainshtein@rbbn.com>
Cc: Stewart Bryant <stewart.bryant@gmail.com>; rtgwg@ietf.org; rtgwg-chairs <rtgwg-chairs@ietf.org>; draft-ietf-rtgwg-segment-routing-ti-lfa@ietf.org; Gyan Mishra <hayabusagsm@gmail.com>
Subject: Re: [EXTERNAL] draft-ietf-rtgwg-segment-routing-ti-lfa : A simple pathological network fragment

As far as I can see SR Microloop avoidance is RFC5715 Section 6.2 nearside tunnelling.

That works unconditionally regardless of the Ti-LFA constraints.

My point is that as soon as you recognise the need to introduce one of the RFC5715 micro loop avoidance methods you admit that TiLFA does not unconditionally address micro loops and thus the TiLFA repair topology constraint is no longer REQUIRED. An implementor may choose to do it, but it becomes OPTIONAL.

I think that this needs a discussion chaired by the RTGWG chairs either during IETF 118 or at a side meeting or at an online interim.

Stewart


On 2 Nov 2023, at 07:55, Alexander Vainshtein <Alexander.Vainshtein@rbbn.com> wrote:

Stewart and all,
Please see some comments inline below.

Regards,
Sasha

From: Stewart Bryant <stewart.bryant@gmail.com>
Sent: Thursday, November 2, 2023 9:30 AM
To: Gyan Mishra <hayabusagsm@gmail.com>
Cc: Stewart Bryant <stewart.bryant@gmail.com>; Alexander Vainshtein <Alexander.Vainshtein@rbbn.com>; rtgwg@ietf.org; rtgwg-chairs <rtgwg-chairs@ietf.org>; draft-ietf-rtgwg-segment-routing-ti-lfa@ietf.org
Subject: [EXTERNAL] Re: draft-ietf-rtgwg-segment-routing-ti-lfa : A simple pathological network fragment

Let me ask a fundamental question.

The whole point of Ti-LFA as sold to the community was that repairing along the post convergence path as opposed to repairing along a convenient temporary path avoided micro loops.
[[Sasha]] To the best of my understanding, repairing along the post-convergence path prevents micro-loops on these paths due to distributed IGP convergence – as long as traffic follows these paths.  It does not – and, obviously, cannot, have any impact on micro-loops happening elsewhere due to distributed IGP convergence – neither prevents micro-loops that would form without TI-LFA, nor introduces any new ones.
The repair path constraint and subsequent segment optimisations add complexity to the calculation of the path.

Are we now saying that micro loops can form elsewhere and as a consequence we need a micro loop avoidance strategy?
[[Sasha]] TI-LFA, same as any other form of LFA that I am aware of, handles just link/node failures.  Micro-loops can happen – and, from my experience, frequently DO happen – during repair of a failed link/node.  IMHO and FWIW this alone justifies the need for the micro-loop avoiding strategy/solution.

If so the fundamental premise behind TiLFA is broken and the repair can simply become: use SR to expeditiously route the packets into Q space and run a micro loop avoidance strategy. This approach removes the complexity of constraining the repair to the post convergence path. [[Sasha]] Please see my previous comment about TI-LFA paths being micro-loop avoidant because they are post-convergence paths.  In other words, one possible micro-loop avoidance strategy is usage of post-convergence paths in the transient period – and this, in a nutshell, is what the SR Micro-Loop Avoidance draft<https://datatracker.ietf.org/doc/html/draft-bashandy-rtgwg-segment-routing-uloop-15> is about (no offence intended).

Of course an implementor could cheat and just use the simplified strategy I describe above, and almost certainly very few operators would notice because:

1) In many cases the two paths would be congruent

2) The transient is short and quite difficult to instrument

3) Unless there was some security reason or traffic management reason for the path constraint few would care.

I will look at the proposed differences later, but this sounds like it should be a topic for discussion in RTGWG before the text is finalised and sent to the RFC Editor.
[[Sasha]] The RTGWG agenda at IETF-118<https://datatracker.ietf.org/meeting/118/materials/agenda-118-rtgwg-02> seems tightly packed already. I wonder if a side meeting for such a discussion could be set in a way that allows online participation?

- Stewart



On 2 Nov 2023, at 05:09, Gyan Mishra <hayabusagsm@gmail.com> wrote:


Hi Sasha, Bruno & Stewart

Thank you for going over my OPSDIR review in detail.

I am good with the latest updated verbiage that Bruno had given.

Comments in-line

On Mon, Oct 23, 2023 at 8:41 AM Alexander Vainshtein <Alexander.Vainshtein@rbbn.com> wrote:
Bruno,
Lots of thanks for a prompt and very encouraging response!

Your version of the text is definitely better than mine, I am all for using it.

As for where the clarifying text could be inserted, I see two options:

•       A common “Applicability Statement” section (there is no such section in the draft)



•       A dedicated section on relationship between TI-LFA and micro-loops.
    Gyan> I think this option would be best. This would fix the existing gap on uLoop. I did mention this, but am not sure if it is possible: as TI-LFA and uLoop are tightly coupled as an overall post-convergence solution, is it possible to combine the drafts and issue another WGLC? (Question for authors)
In any case, I defer to you and the rest of the authors to decide what, if anything should be done for clarifying the relationship