Re: [EXTERNAL] draft-ietf-rtgwg-segment-routing-ti-lfa : A simple pathological network fragment

Stewart Bryant <stewart.bryant@gmail.com> Mon, 06 November 2023 08:49 UTC

From: Stewart Bryant <stewart.bryant@gmail.com>
Message-Id: <52855BF0-5C0C-437B-B3F8-77473FE1AD2A@gmail.com>
Content-Type: multipart/alternative; boundary="Apple-Mail=_F99DA35B-B52F-4A27-B59E-5BBA2612CCF5"
Mime-Version: 1.0 (Mac OS X Mail 16.0 \(3774.100.2.1.4\))
Subject: Re: [EXTERNAL] draft-ietf-rtgwg-segment-routing-ti-lfa : A simple pathological network fragment
Date: Mon, 06 Nov 2023 08:49:35 +0000
In-Reply-To: <PH0PR03MB6300F0FF5221D45EB4B75DC4F6AAA@PH0PR03MB6300.namprd03.prod.outlook.com>
Cc: Stewart Bryant <stewart.bryant@gmail.com>, Gyan Mishra <hayabusagsm@gmail.com>, "rtgwg@ietf.org" <rtgwg@ietf.org>, rtgwg-chairs <rtgwg-chairs@ietf.org>, "draft-ietf-rtgwg-segment-routing-ti-lfa@ietf.org" <draft-ietf-rtgwg-segment-routing-ti-lfa@ietf.org>
To: Alexander Vainshtein <Alexander.Vainshtein@rbbn.com>
References: <9908D9F3-45C6-497D-B3BF-84D8A68A5013@gmail.com> <AS2PR02MB88395D3114B0DEE583BEEF65F0D7A@AS2PR02MB8839.eurprd02.prod.outlook.com> <60124119-5847-4F52-8BB8-18398A9BA4AC@gmail.com> <AS2PR02MB8839FB5A5537FC3E9F37A560F0D4A@AS2PR02MB8839.eurprd02.prod.outlook.com> <PH0PR03MB63004F32F9AF282ECDB78637F6D9A@PH0PR03MB6300.namprd03.prod.outlook.com> <AS2PR02MB88393EC50B913A5F8C3AB5E2F0D8A@AS2PR02MB8839.eurprd02.prod.outlook.com> <PH0PR03MB6300D9A7F9DC3E2E864EF11EF6D8A@PH0PR03MB6300.namprd03.prod.outlook.com> <CABNhwV30uhLOo52WHAv6YS4Wg0k9gDbkrs1ANuGPPdLzc1=dsw@mail.gmail.com> <PH0PR03MB6300958E56135029D7D336AEF6A6A@PH0PR03MB6300.namprd03.prod.outlook.com> <CABNhwV1T8Wg-JGf3Xi0=KYXut0pyah1PKOxY3edoFeTts+99iQ@mail.gmail.com> <PH0PR03MB6300F6764AD67F5A08321B33F6ABA@PH0PR03MB6300.namprd03.prod.outlook.com> <CABNhwV25U7k8r3KoYaaX9dTVduh942V9FNpfGKHMw4rST6jxxQ@mail.gmail.com> <PH0PR03MB6300F0FF5221D45EB4B75DC4F6AAA@PH0PR03MB6300.namprd03.prod.outlook.com>
X-Mailer: Apple Mail (2.3774.100.2.1.4)
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtgwg/S_GZ0a8EgWxBGib4rAFUhCnefVU>


> On 6 Nov 2023, at 08:31, Alexander Vainshtein <Alexander.Vainshtein@rbbn.com> wrote:
> 
>  
> Gyan,
> Lots of thanks for your email.
>  
> I hope you (and others) will be able to attend the side meeting proposed by Yingzhen.
>  
> A few technical comments:
>  
> I do not think that TI-LFA or micro-loop avoiding paths comprised of long lists of Adj-SIDs are practical

SB> Correct, and they are only needed for node failure, particularly if you remove the congruence requirement, which as far as I can see is not needed.

> I think that, in the case of a single link failure:
> TI-LFA paths computed in accordance with the rules defined in Section 6.1/6.2/6.3 of the TI-LFA draft are micro-loop avoiding regardless of the distributed nature of IGP convergence.  E.g., a path that uses a PQ node will not be affected by the distributed nature of IGP convergence, because:
>    i.   The path between the PLR and its adjacent node to whose P-space the PQ-node belongs is not affected by IGP convergence
>    ii.  The sets of pre-convergence and post-convergence shortest paths from the above-mentioned adjacency of the PLR to the PQ node are the same – this is the definition of the P-space
>    iii. The sets of pre-convergence and post-convergence shortest paths from the PQ-node to the destination are the same – this is the definition of the Q-space
>    iv.  The “stitching” of the two paths performed by the PQ-node is not affected by IGP convergence as well.

All correct, and well documented in the various FRR framework documents and the RLFA work.

> TI-LFA is applied to traffic to the destinations affected by the failure by the PLRs immediately after the failure has been detected, and, if it uses one of the above-mentioned techniques, its paths will not be affected by the distributed nature of IGP convergence. These paths should be kept long enough for the rest of the network to apply their micro-loop avoiding paths and then to switch to “normal” SPF paths
Again, that is well documented in the loop-free work.
> All other (non-PLR) nodes should:
>    i.   Compute and apply micro-loop avoidance paths once they complete collection of information about the topology change and identify the change as a single link failure
>    ii.  Keep these paths long enough to let the rest of the nodes do the same, and then switch to “normal” SPF.
>  

Again, this is what the loop-free framework teaches.

So we know how to make this work, and I am not sure why TI-LFA needs to complicate it.

A minimum solution for link failure is RLFA with an SR adjacency SID to get from P to Q if there is no PQ node in either P or extended P space, plus either a simple but slow approach to convergence using incremental metrics, or a tunnel from every ingress to either the PLR or into Q space, followed by an agreed delay to allow for convergence of the nodes in P space. TI-LFA MUST be deployed with a convergence strategy, and all of the approaches I describe in this paragraph are mutually compatible and backwards compatible with TI-LFA.
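
As a rough illustration of the link-failure case, here is a minimal Python sketch that computes P-space, extended P-space and Q-space for the R4-R5 failure on the ring fragment quoted further down this thread, and falls back to an adjacent P/Q pair (crossed with one adjacency SID) when there is no PQ node. Unit link costs, the function names and the returned tuples are all assumptions made for the example; none of it is taken from the draft.

```python
from collections import deque

# Ring fragment from this thread. Unit link costs are an assumption.
LINKS = [("CE1", "R1"), ("R1", "R2"), ("R2", "R3"), ("R3", "R4"),
         ("R4", "R5"), ("R5", "CE2"), ("R1", "R6"), ("R6", "R7"),
         ("R7", "R8"), ("R8", "R9"), ("R9", "R10"), ("R10", "R5")]

def adjacency(links):
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj

def hop_counts(adj, root):
    """Shortest-path cost from root to every node (BFS, unit costs)."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return dist

ADJ = adjacency(LINKS)
DIST = {n: hop_counts(ADJ, n) for n in ADJ}

def space(root, link):
    """Nodes for which no shortest path from root crosses `link`.

    With symmetric costs this yields P-space when root is the PLR and
    Q-space when root is the far end of the protected link.
    """
    u, v = link
    return {x for x in ADJ
            if DIST[root][u] + 1 + DIST[v][x] != DIST[root][x]
            and DIST[root][v] + 1 + DIST[u][x] != DIST[root][x]}

def repair(plr, far_end, extended=True):
    """Hypothetical helper: pick a PQ node, else an adjacent P/Q pair."""
    link = (plr, far_end)
    p = space(plr, link)
    if extended:
        # Extended P-space: union of the P-spaces of the PLR's other neighbours.
        for nbr in ADJ[plr] - {far_end}:
            p |= space(nbr, link)
    q = space(far_end, link)
    pq = p & q
    if pq:
        return ("pq-node", min(pq, key=lambda n: (DIST[plr][n], n)))
    # No PQ node: use adjacent P and Q nodes, excluding the failed link itself.
    pairs = sorted((a, b) for a in p for b in ADJ[a]
                   if b in q and (a, b) != link)
    return ("p-adj-q", pairs[0]) if pairs else ("none", None)

print(repair("R4", "R5"))                   # extended P-space
print(repair("R4", "R5", extended=False))   # plain P-space only
```

Under these assumptions R7 comes out as the PQ node in extended P-space, and with plain P-space only there is no PQ node, so the sketch returns the adjacent pair (R6, R7).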

Stewart



>  
> Regards,
> Sasha
>  
> From: Gyan Mishra <hayabusagsm@gmail.com> 
> Sent: Monday, November 6, 2023 2:19 AM
> To: Alexander Vainshtein <Alexander.Vainshtein@rbbn.com>
> Cc: bruno.decraene@orange.com; draft-ietf-rtgwg-segment-routing-ti-lfa@ietf.org; rtgwg-chairs <rtgwg-chairs@ietf.org>; rtgwg@ietf.org
> Subject: Re: [EXTERNAL] Re: draft-ietf-rtgwg-segment-routing-ti-lfa : A simple pathological network fragment
>  
> Hi Sasha 
>  
> Welcome and thank you for all your detailed analysis and engagement on this thread!
>  
> Responses in-line 
>  
> On Sun, Nov 5, 2023 at 7:56 AM Alexander Vainshtein <Alexander.Vainshtein@rbbn.com> wrote:
> Gyan,
> Lots of thanks for your email.
>  
> I fully agree that TI-LFA draft should be published ASAP. Hopefully the clarifications proposed by Bruno and myself would suffice for resolving most of the concerns regarding the relationship between TI-LFA and micro-loops.
>  Gyan> Agreed
> I think what also should be added in the update is relationship between TI-LFA post convergence path and use of minimum number of sids prefix-sid/node sid, and issue with ECMP nature of prefix sid use in case of a chain of nodes with cascading delays in convergence times that requires uloop avoidance list of Adj-sid list in cases where the extended P space for RLFA physical loop style topology is not converged and requires tunneling over the chain of nodes. In the case where the post convergence path is loop free then prefix-sid/node-sid ECMP is not an issue and am good there.
> For the interaction between TI-LFA and uLoop, for example - TI-LFA applies a loop free policy along the post convergence path static sid list at time T1, so at what time T2 does the uLoop kick in? uLoop solution does not have a pre programmed path as it’s built based on near or far side tunneling post convergence path after the failure has occurred. So the only time uLoop would kick in is with uLoop detection of micro loops in the topology, which I think the uLoop or TI-LFA drafts do not discuss. Once uLoops are found either in P or Q space for the RLFA then a policy with hop by hop list of SIDs is created for the SR policy from extended P to Q space PQ node. I don’t see either draft mentioning this detailed interaction and sequence of events from failure to recovery on backup path. The key here with uLoop avoidance as opposed to TI-LFA is that TI-LFA is “not tunneled” as tunneling is not required as the post convergence path is assumed to be loop free. However, if not loop free within the extended P space with chain of nodes with cascading convergence delays then some sort of RFC 5715 near or far side tunneling needs to be implemented to tunnel over all the nodes not converged. So I think the critical interaction here with TI-LFA and SR uLoop as opposed to RFC 7490 T-LDP based uLoop is that when loops exist pre or post convergence with traditional MPLS you specify a single FRR label and the traffic is tunneled to the Q space, however with SR prefix/node-sid it’s ECMP, so no tunneling and that is an issue.
> In the uLoop draft figure 1 example it shows the RFC 5715 example of using far side tunneling. The reason for far side is, I guess, that it scales better, but I think that far side solution is only applicable for traditional MPLS where a single FRR label pushed yields a tunnel, whereas with SR-MPLS prefix/node-sid yields ECMP and thus no tunneling, and so AFAIK far side won’t work with SR-MPLS. I think you have to use near side tunneling with list of hop by hop Adj-sid. I have not tested this in the lab but I don’t think far side will work with SR when micro loops exist on the post convergence path in a looped physical topology scenario with a chain of nodes with convergence delays. Distributed tunneling is the same as near side so I think AFAIK due to ECMP issue with SR you have to use near side tunneling for uLoop avoidance.
> The SR Micro-loop Avoidance draft indeed exists for 7+ years already and is quite stable. Unfortunately, stability also includes Section 3 of the draft that remains unchanged from 00 to -15 (current) version.
>     Gyan> Section 3, it’s mentioned that the policy is out of scope but I think it should be in scope as the goal of TI-LFA is local protection which includes recovery on the P space RLFA post convergence backup path and uLoop invokes as well recovery along the extended P space to Q space PQ node that includes all of the chain of cascading delayed convergence nodes.  So I think that both TI-LFA and uLoop are involved in SR policy based recovery process and AFAIK should be in scope  used for protected convergence and recovery along the backup path.
>  
> And in any case micro-loop avoidance includes the case of recovering links, while TI-LFA only deals with link failures.
> This alone looks to me a valid reason not to merge these drafts.
>  
> My 2c,
> Sasha
>  
> From: Gyan Mishra <hayabusagsm@gmail.com>
> Sent: Friday, November 3, 2023 4:43 AM
> To: Alexander Vainshtein <Alexander.Vainshtein@rbbn.com>
> Cc: bruno.decraene@orange.com; draft-ietf-rtgwg-segment-routing-ti-lfa@ietf.org; rtgwg-chairs <rtgwg-chairs@ietf.org>; rtgwg@ietf.org
> Subject: Re: [EXTERNAL] Re: draft-ietf-rtgwg-segment-routing-ti-lfa : A simple pathological network fragment
>  
> Hi Sasha 
>  
> In-line below 
>  
> On Thu, Nov 2, 2023 at 4:01 AM Alexander Vainshtein <Alexander.Vainshtein@rbbn.com> wrote:
> Gyan and all,
> Inline below.
>  
> Regards,
> Sasha
>  
> From: Gyan Mishra <hayabusagsm@gmail.com>
> Sent: Thursday, November 2, 2023 7:09 AM
> To: Alexander Vainshtein <Alexander.Vainshtein@rbbn.com>
> Cc: bruno.decraene@orange.com; draft-ietf-rtgwg-segment-routing-ti-lfa@ietf.org; rtgwg-chairs <rtgwg-chairs@ietf.org>; rtgwg@ietf.org
> Subject: [EXTERNAL] Re: draft-ietf-rtgwg-segment-routing-ti-lfa : A simple pathological network fragment
>  
>  
> Hi Sasha, Bruno & Stewart 
>  
> Thank you for going over my OPSDIR review in detail.
>  
> I am good with the latest updated verbiage that Bruno had given.
>  
> Comments in-line 
>  
> On Mon, Oct 23, 2023 at 8:41 AM Alexander Vainshtein <Alexander.Vainshtein@rbbn.com> wrote:
> Bruno,
> Lots of thanks for a prompt and very encouraging response!
>  
> Your version of the text is definitely better than mine, I am all for using it.
>  
> As for where the clarifying text could be inserted, I see two options:
> · A common “Applicability Statement” section (there is no such section in the draft)
> · A dedicated section on relationship between TI-LFA and micro-loops.
> 
>     Gyan> I think this option would  be best.  This would fix the existing gap on uLoop.  I did mention but not sure if possible- as TI-LFA and uLoop are tightly coupled as a overall post convergence solution is it possible to combine the drafts and issue another WGLC.  (Question for authors)
>           [[Sasha]] Given the current state of the SR Micro-Loop Avoidance draft <https://datatracker.ietf.org/doc/html/draft-bashandy-rtgwg-segment-routing-uloop-15> (still an individual submission at -15 version) I doubt merging TI-LFA and micro-loop avoidance is a good idea.
>         Gyan> TI-LFA is a critical draft for operator SR deployments and I agree getting it published asap is a good idea.  All vendors that have implemented     TI-LFA  have implemented uLoop.  In reality any operator deploying TI-LFA would  always deploy uLoop avoidance at the same time per vendor recommendation.  The uLoop I-D  is 7 years old and  is mature as every vendor that has implemented TI-LFA has also implemented uLoop,  so I think this could be slam dunk to do a quick Adoption followed by expedite through WGLC and publish.  The other option is combine the drafts which may or may not be favorable to the WG.  
>  
> The uLoop basic concept is simple —>> building a list of adj-sid from PLR to RLFA PQ node merge point with a timer set at time T1 post convergence and removed when T2 timer pops.  Simple!  The solution for TI-LFA in my mind is not complete without uLoop.  The major issue that Stewart pointed out is related to multiple entry points or chain of P space nodes preceding the PLR or multiple Q space nodes preceding the RLFA PQ node merge point is what I documented in my review.  Any of those longer chain of nodes can have uLoop distributed convergence cascaded delays.
>  
> TI-LFA implementations aim to solve with optimized least number of SID to avoid hardware MSD issues to solve the problem using a single node-sid plus maybe an adj-sid and at most 4 sid’s.  Use of node-sid yields ECMP along the chain of nodes not yet converged resulting in many possible micro loops is the major issue that the  hop by hop list of adj-sid’s along the post convergence path solves with the uLoop draft.  
>  
> I don’t know of any other way to resolve the TI-LFA uLoop issue if implemented by itself if node-sid ECMP is utilized. One option, though unlikely, is that where a chain of nodes exists, TI-LFA, if configured by itself w/o uLoop and while signaling the MSD maximum threshold, can build an adj-sid list across the nodes not yet converged from PLR to PQ node merge point. Other than trying to fix TI-LFA so it can work independently of the uLoop feature, the alternative is to do what we have been discussing in the thread: adding text related to micro loops and the interaction between the TI-LFA draft and the uLoop draft.
>  
> Cheers,
>  
> Gyan
>  
> In any case, I defer to you and the rest of the authors to decide what, if anything should be done for clarifying the relationship between TI-LFA and micro-loops.
>  
>  
> Regards,
> Sasha
>  
> From: bruno.decraene@orange.com <bruno.decraene@orange.com>
> Sent: Monday, October 23, 2023 3:27 PM
> To: Alexander Vainshtein <Alexander.Vainshtein@rbbn.com>
> Cc: rtgwg@ietf.org; rtgwg-chairs <rtgwg-chairs@ietf.org>; draft-ietf-rtgwg-segment-routing-ti-lfa@ietf.org; Stewart Bryant <stewart.bryant@gmail.com>
> Subject: [EXTERNAL] RE: draft-ietf-rtgwg-segment-routing-ti-lfa : A simple pathological network fragment
>  
> Sasha,
>  
> Thanks for the summary and the constructive proposal.
> Speaking for myself, this makes sense and I agree.
>  
> - TI-LFA is a local operation applied by the PLR when it detects failure of one of its local links. As such, it does not affect:
> 
> o   Micro-loops that appear – or do not appear – on the paths to the destination that do not pass thru TI-LFA paths
> 
>  
> As an editorial comment, depending on where such text would be inserted, I would propose the following change:
> OLD: Micro-loops that appear – or do not appear –
> NEW: Micro-loops that appear – or do not appear – as part of the distributed IGP convergence [RFC5715]
>  
> Motivation: some reader could wrongly understand that such micro-loops are caused by TI-LFA
>  
> Thanks,
> Regards,
> --Bruno
>  
> Orange Restricted
> From: Alexander Vainshtein <Alexander.Vainshtein@rbbn.com>
> Sent: Sunday, October 22, 2023 4:21 PM
> To: DECRAENE Bruno INNOV/NET <bruno.decraene@orange.com>; Stewart Bryant <stewart.bryant@gmail.com>
> Cc: rtgwg@ietf.org; rtgwg-chairs <rtgwg-chairs@ietf.org>; draft-ietf-rtgwg-segment-routing-ti-lfa@ietf.org
> Subject: RE: draft-ietf-rtgwg-segment-routing-ti-lfa : A simple pathological network fragment
> Importance: High
>  
> Bruno, Stewart and all,
> I think that most of the things about TI-LFA and micro-loops have been said already (if in a slightly different context)  and are mainly self-evident.
> However, I share the feeling that somehow the relationship between TI-LFA and micro-loop avoidance has become somewhat muddled.
>  
> Therefore, I would like to suggest adding some text to the TI-LFA draft that clarifies this relationship, e.g., along the following lines:
> 1.       TI-LFA is a local operation applied by the PLR when it detects failure of one of its local links. As such,  it does not affect:
> 
> a.       Micro-loops that appear – or do not appear – on the paths to the destination that do not pass thru TI-LFA paths
> 
>          i.   As explained in RFC 5714, such micro-loops may result in the traffic not reaching the PLR and therefore not following TI-LFA paths
> 
>          ii.  Segment Routing may be used for prevention of such micro-loops as described in the micro-loop avoidance draft
> 
> b.       Micro-loops that appear – or do not appear - when the failed link is repaired (aside: the need for this line is based on personal experience☹)
> 
> 2.       TI-LFA paths are loop-free. What’s more, they follow the post-convergence paths and are therefore not subject to micro-loops due to differences in the IGP convergence times of the nodes thru which they pass
> 
> 3.       TI-LFA paths are applied from the moment the PLR detects failure of a local link and until IGP convergence at the PLR is completed. Therefore, early (relative to the other nodes) IGP convergence at the PLR and the subsequent “early” release of TI-LFA paths may cause micro-loops, especially if these paths have been computed using the methods described in Section 6.2, 6.3 or 6.4 of the draft. One of the possible ways to prevent such micro-loops is local convergence delay (RFC 8333).
> 
> 4.       TI-LFA procedures are complementary to application of any micro-loop avoidance procedures in the case of link or node failure:
> 
> a.       Link or node failure requires some urgent action to restore the traffic that passed thru the failed resource. TI-LFA paths are pre-computed and pre-installed and therefore suitable for urgent recovery
> 
> b.       The paths used in the micro-loop avoidance procedures typically cannot be pre-computed.
> 
>  
> Hopefully these notes would be useful.
>  
> Regards,
> Sasha
>  
> From: rtgwg <rtgwg-bounces@ietf.org> On Behalf Of bruno.decraene@orange.com
> Sent: Thursday, October 19, 2023 7:34 PM
> To: Stewart Bryant <stewart.bryant@gmail.com>
> Cc: rtgwg@ietf.org; rtgwg-chairs <rtgwg-chairs@ietf.org>; draft-ietf-rtgwg-segment-routing-ti-lfa@ietf.org
> Subject: [EXTERNAL] RE: draft-ietf-rtgwg-segment-routing-ti-lfa : A simple pathological network fragment
>  
> Hi Stewart,
>  
> I agree with you on the technical points, so the first part of your email up to “So I think”.
>  
> But I don’t quite follow why you want to mix IGP Convergence issues with this Fast ReRoute Solution.
> To quote RFC 5714, “IP Fast Reroute Framework”:
>  
> In order to reduce packet disruption times to a duration commensurate
>    with the failure detection times, two mechanisms may be required:
>  
>    a.  A mechanism for the router(s) adjacent to the failure to rapidly
>        invoke a repair path, which is unaffected by any subsequent re-
>        convergence.
>  
>    b.  In topologies that are susceptible to micro-loops, a micro-loop
>        control mechanism may be required [RFC5715 <https://datatracker.ietf.org/doc/html/rfc5715>].
>  
>    Performing the first task without the second may result in the repair
>    path being starved of traffic and hence being redundant.
>  
> https://datatracker.ietf.org/doc/html/rfc5714#section-4
>  
> I would assume that you agree with the above (as you are an author of this RFC and my guess would be that you wrote that text)
>  
> My point is that there are two different mechanisms involved, in two different time periods:
> -     Fast ReRoute (“a”): this is the scope of draft-ietf-rtgwg-segment-routing-ti-lfa
> o   Timing: from detection time , to start of the IGP convergence
> -     IGP Micro-loop avoidance (“b”)
> o   Timing: from start of IGP convergence to end of IGP convergence
>  
> The scope of draft-ietf-rtgwg-segment-routing-ti-lfa is FRR / “a”. IGP micro-loop is out of scope. Other documents are proposing solutions for this. (and for those Micro-loop documents, FRR is similarly out of scope).
>  
> Personally I agree with you that both mechanisms are needed. But I think that this is already highlighted in RFC 5714, and that this is no different than RFC 7490 (RLFA). Therefore, I don’t see why the outcome/text should be different. Hence my proposition to reuse the text from RFC 7490 (RLFA). I find it adequate. You wrote it, so you probably find it adequate too.
>  
> On a side note, RFC5715, that you also wrote, seems to already cover what you are asking for. Quoting the abstract, it
>       provides a summary of the causes and consequences of
>    micro-loops and enables the reader to form a judgement on whether
>    micro-looping is an issue that needs to be addressed in specific
>    networks.
>  
> Note that this RFC5715 is already cited in the proposed text.
>  
> PS: If you were ready to write a 5715bis, I would support this.
>  
> Best regards,
> --Bruno
>  
>  
> From: Stewart Bryant <stewart.bryant@gmail.com>
> Sent: Tuesday, October 17, 2023 1:48 PM
> To: DECRAENE Bruno INNOV/NET <bruno.decraene@orange.com>
> Cc: Stewart Bryant <stewart.bryant@gmail.com>; rtgwg@ietf.org; rtgwg-chairs <rtgwg-chairs@ietf.org>; draft-ietf-rtgwg-segment-routing-ti-lfa@ietf.org
> Subject: Re: draft-ietf-rtgwg-segment-routing-ti-lfa : A simple pathological network fragment
>  
> Hi Bruno
>  
> I was thinking about this some more. It is something that was recognised in the early days, but somewhat swept aside.
>  
> The case that Gyan brought up was an ECMP case, but I fear that the case is more common and I think we should characterise it as part of the text rather than giving the impression it is unusual.
>  
> I think the problem occurs whenever there are two or more nodes between the point of packet entry and the failure.
>  
> CE1 - R1 - R2 - R3 - R4 -/- R5 - CE2
>       |                     |
>       R6 - R7 - R8 - R9 — R10
>  
> The normal path CE1-CE2 is via R2
>  
> When R4-R5 fails it is trivial to see how the repair works with R7 as the entry into Q space.
>  
> However unless R1, R2,  R3 converge in that order there will be microloops for traffic entering via any of those three nodes.
>  
> So I think we can say that unless the PLR receives the traffic to be protected only directly or from its immediate neighbour, there is no guarantee against micro loops that the proposed strategy of aligning the repair path with the post-convergence path cannot address.
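
The ordering claim above can be checked with a small Python sketch. It assumes unit link costs, treats the PLR (R4) as handing every packet to a loop-free repair path for the whole transition, and lets only R1, R2 and R3 switch FIBs at different times; every other node is assumed to already use its post-convergence FIB (with unit costs R6 would otherwise be an ECMP case of its own). The names `can_loop` and `order_is_loop_free` are made up for the example.

```python
from collections import deque
from itertools import permutations

# Chain/ring fragment from the message above. Unit link costs are an assumption.
LINKS = [("CE1", "R1"), ("R1", "R2"), ("R2", "R3"), ("R3", "R4"),
         ("R4", "R5"), ("R5", "CE2"), ("R1", "R6"), ("R6", "R7"),
         ("R7", "R8"), ("R8", "R9"), ("R9", "R10"), ("R10", "R5")]
FAILED = ("R4", "R5")
PLR, DEST = "R4", "CE2"
CHAIN = ("R1", "R2", "R3")   # nodes between packet entry and the PLR

def next_hops(links, dest):
    """ECMP next-hop sets towards dest, from a BFS rooted at dest."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    dist = {dest: 0}
    queue = deque([dest])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return {n: {m for m in adj[n] if dist[m] == dist[n] - 1}
            for n in adj if n != dest}

OLD = next_hops(LINKS, DEST)                                # pre-failure FIBs
NEW = next_hops([l for l in LINKS if l != FAILED], DEST)    # post-convergence FIBs

def can_loop(entry, converged):
    """True if a packet entering at `entry` can revisit a node."""
    def walk(node, seen):
        if node in (DEST, PLR):      # delivered, or handed to the repair path
            return False
        if node in seen:
            return True
        fib = OLD if node in CHAIN and node not in converged else NEW
        return any(walk(hop, seen | {node}) for hop in fib[node])
    return walk(entry, frozenset())

def order_is_loop_free(order):
    """Check every intermediate state while nodes converge in `order`."""
    for done in range(len(order) + 1):
        converged = set(order[:done])
        if any(can_loop(entry, converged) for entry in CHAIN):
            return False
    return True

SAFE_ORDERS = [o for o in permutations(CHAIN) if order_is_loop_free(o)]
print(SAFE_ORDERS)
```

Under these assumptions only the order R1, R2, R3 survives; any other order leaves a window in which a converged node forwards back to a neighbour still using its old FIB.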
>  
> Now thinking about the text you have below, I think we need to write it in terms of - Unless the operator is certain that no micro loops will form over any path the protected traffic will traverse between entry to the network and arrival at the PLR a micro loop avoidance method MUST be deployed. Of course I think that it would be helpful to the operator community for the text to provide some guidance on how to ascertain whether there is a danger of the formation of micro loops.
>  
> I would note that the long chains of nodes shown in the example above were probably not present in the test topologies which as I remember were all national scale provider networks, but unless we provide guidance otherwise Ti-LFA could reasonably be deployed in edge networks and in the case of cell systems these are often ring topologies.
>  
> So I think we need to agree (as a WG) on the constraints that we are prepared to specify in the text and the degree of warning we need to provide to the operator community and then we can polish the text below.
>  
> Best regards
>  
> Stewart
>  
>  
>  
>  
> 
> On 16 Oct 2023, at 17:25, bruno.decraene@orange.com wrote:
>  
> Hi Stewart,
>  
> Please see inline
>  
>  
> From: Stewart Bryant <stewart.bryant@gmail.com>
> Sent: Monday, October 16, 2023 2:08 PM
> To: rtgwg@ietf.org; rtgwg-chairs <rtgwg-chairs@ietf.org>; draft-ietf-rtgwg-segment-routing-ti-lfa@ietf.org
> Cc: Stewart Bryant <stewart.bryant@gmail.com>
> Subject: draft-ietf-rtgwg-segment-routing-ti-lfa : A simple pathological network fragment
>  
> During the operations directorate early review of draft-ietf-rtgwg-segment-routing-ti-lfa 
> Gyan Mishra points to a simple pathological network fragment that I think deserves wider discussion.
>  
> https://datatracker.ietf.org/doc/review-ietf-rtgwg-segment-routing-ti-lfa-11-opsdir-early-mishra-2023-08-25/
>  
> I am not aware of any response to the RTGWG by the draft authors concerning the review comment and I cannot see obvious new text addressing this concern.
> 
> The fragment is as follows
> 
> CE1 –R1- R2-/-R3-CE2
>      |         |
>      R4 – R5 -R6
> 
> In the pre converged network R4 is ECMP to CE2 via R5 (cost 4) and via R1 (also cost 4).
> 
> We can easily build a TI-LFA repair path from R2 under link failure to CE2 (so long as we remember that R4 is an ECMP path to CE2), but the problem occurs during convergence. If R1 converges before R4, R4 may ECMP packets addressed to CE2 back to R1 in a micro loop. Meanwhile since no packets for R3 are reaching R2 the Ti-LFA repair is not doing anything useful. 
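
The ECMP case described above can be reproduced with a minimal Python sketch. It assumes unit link costs (which match the two cost-4 paths from R4), reads the vertical links off the diagram as R1-R4 and R3-R6, and treats the PLR (R2) as always handing packets to its repair path. The function names are invented for the illustration.

```python
from collections import deque

# Fragment from the message above. Unit costs and the R1-R4 / R3-R6
# vertical links are assumptions read off the ASCII diagram.
LINKS = [("CE1", "R1"), ("R1", "R2"), ("R2", "R3"), ("R3", "CE2"),
         ("R1", "R4"), ("R4", "R5"), ("R5", "R6"), ("R6", "R3")]
FAILED = ("R2", "R3")
PLR, DEST = "R2", "CE2"

def next_hops(links, dest):
    """ECMP next-hop sets towards dest, from a BFS rooted at dest."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    dist = {dest: 0}
    queue = deque([dest])
    while queue:
        node = queue.popleft()
        for nxt in adj[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                queue.append(nxt)
    return {n: {m for m in adj[n] if dist[m] == dist[n] - 1}
            for n in adj if n != dest}

OLD = next_hops(LINKS, DEST)                                # pre-failure FIBs
NEW = next_hops([l for l in LINKS if l != FAILED], DEST)    # post-convergence FIBs

def can_loop(entry, converged):
    """True if a packet entering at `entry` can revisit a node."""
    def walk(node, seen):
        if node in (DEST, PLR):      # delivered, or handed to the repair path
            return False
        if node in seen:
            return True
        fib = NEW if node in converged else OLD
        return any(walk(hop, seen | {node}) for hop in fib[node])
    return walk(entry, frozenset())

print(sorted(OLD["R4"]))            # pre-failure ECMP set at R4
print(can_loop("R1", {"R1"}))       # R1 converged, R4 not yet
print(can_loop("R1", {"R4"}))       # R4 converged, R1 not yet
```

With R1 converged first, R1 forwards to R4 and R4's stale ECMP set still includes R1, so the walk can revisit R1; with R4 converged first no such window appears in this model.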
> 
> The TI-LFA text leads the reader to conclude that it is a loop-free solution, but gives no guidance on how to determine when this assumption breaks down. There is an informative reference to 
> draft-bashandy-rtgwg-segment-routing-uloop, but this short individual draft does little to help the reader determine when a loop avoidance strategy needs to be deployed, and the loop-free approach it describes does not seem to be fully developed.
>  
> I am worried that proceeding with the TI-LFA draft without noting that there is a real risk that simple network fragments can micro-loop, and without providing a fully formed mitigation strategy, is a disservice to the operator community, given the industry interest in TI-LFA and the insidious nature of unexpected micro-loop network transients. I am wondering what the view of the working group is on how to proceed.
>  
> One approach would be for the TI-LFA draft to incorporate detailed guidance on how to determine the risk of a micro-loop in a specific operator network, and to provide specific mitigation advice. Another approach would be to reference a developed loop avoidance strategy and recommend its preemptive deployment. Another approach would be to make draft-bashandy-rtgwg-segment-routing-uloop a normative reference and tie the fate of the two drafts. Another approach would be to elaborate on the risks and their manifestations but declare it a currently unsolved problem. I am sure there are other options that the WG may formulate.
>  
> What is the opinion of the working group on how we should proceed with draft-ietf-rtgwg-segment-routing-ti-lfa when considering the possible formation of micro loops?
>  
> FRR takes place between the failure (detection) and the IGP reconvergence. Those are two consecutive steps that the WG has so far addressed with different solutions and documents.
> That’s not new and that’s not specific to TI-LFA. E.g., that’s applicable to RLFA.
>  
> Would the below text, taken verbatim from RFC 7490 (RLFA), work for you? Or would you say that the text is not good enough?
> “When the network reconverges, micro-loops [RFC5715] can form due to
>    transient inconsistencies in the forwarding tables of different
>    routers.  If it is determined that micro-loops are a significant
>    issue in the deployment, then a suitable loop-free convergence
>    method, such as one of those described in [RFC5715], [RFC6976], or
>    [ULOOP-DELAY], should be implemented.”
>  
> https://datatracker.ietf.org/doc/html/rfc7490#section-10
>  
> Of course, we could update the list of informative references,
> e.g., by adding another informative reference to draft-bashandy-rtgwg-segment-routing-uloop and by removing the informative references to [RFC6976] and [ULOOP-DELAY], which are probably outdated.
>  
> --Bruno
>  
>  
> - Stewart
> ____________________________________________________________________________________________________________
> Ce message et ses pieces jointes peuvent contenir des informations confidentielles ou privilegiees et ne doivent donc
> pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce message par erreur, veuillez le signaler
> a l'expediteur et le detruire ainsi que les pieces jointes. Les messages electroniques etant susceptibles d'alteration,
> Orange decline toute responsabilite si ce message a ete altere, deforme ou falsifie. Merci.
>  
> This message and its attachments may contain confidential or privileged information that may be protected by law;
> they should not be distributed, used or copied without authorisation.
> If you have received this email in error, please notify the sender and delete this message and its attachments.
> As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified.
> Thank you.
>  
> 
> Disclaimer
> 
> This e-mail together with any attachments may contain information of Ribbon Communications Inc. and its Affiliates that is confidential and/or proprietary for the sole use of the intended recipient. Any review, disclosure, reliance or distribution by others or forwarding without express permission is strictly prohibited. If you are not the intended recipient, please notify the sender immediately and then delete all copies, including any attachments.
> 
> _______________________________________________
> rtgwg mailing list
> rtgwg@ietf.org
> https://www.ietf.org/mailman/listinfo/rtgwg