Re: draft-ietf-rtgwg-segment-routing-ti-lfa : A simple pathological network fragment

Stewart Bryant <stewart.bryant@gmail.com> Wed, 08 November 2023 14:07 UTC

From: Stewart Bryant <stewart.bryant@gmail.com>
Subject: Re: draft-ietf-rtgwg-segment-routing-ti-lfa : A simple pathological network fragment
Date: Wed, 08 Nov 2023 14:07:21 +0000
In-Reply-To: <AS2PR02MB88394ADF431FCE125E238DB3F0A8A@AS2PR02MB8839.eurprd02.prod.outlook.com>
Cc: Stewart Bryant <stewart.bryant@gmail.com>, Ahmed Bashandy <abashandy.ietf@gmail.com>, Alexander Vainshtein <Alexander.Vainshtein@rbbn.com>, "rtgwg@ietf.org" <rtgwg@ietf.org>, "draft-ietf-rtgwg-segment-routing-ti-lfa@ietf.org" <draft-ietf-rtgwg-segment-routing-ti-lfa@ietf.org>, rtgwg-chairs <rtgwg-chairs@ietf.org>, Gyan Mishra <hayabusagsm@gmail.com>
To: bruno.decraene@orange.com
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtgwg/3CJjhn4zLUw2Lo4vRnP7M520GyQ>


> On 8 Nov 2023, at 13:18, bruno.decraene@orange.com wrote:
> 
> Hi Stewart,
>  
> Thanks for your email and your rephrased summary.
>  
> Strangely, I feel that we are in agreement. At least, unless I missed a point, I agree with your email below. I’d propose to rephrase it to check whether you agree with my rephrasing (a 3-way handshake seems safer). Please find my summary below:
> TI-LFA is an FRR solution that works. It provides loop-free protection from the PLR to the destination.
> When IGP convergence starts, micro-loops may happen because IGP convergence is distributed. They may affect forwarding from the source/ingress to the PLR (and hence starve the PLR).
> If one has promised 50 ms recovery to one's customers, one needs both an FRR solution and a micro-loop solution. (TI-LFA being an FRR solution, you still need a micro-loop solution.)
>  
> Are we in sync on the above?
> (On a side note, what we call “micro-loops” is really “a possibility of micro-loops”. They may not happen (by topology or by chance), in which case the customer does see an improvement with FRR only.)
>  
We agree. However, I would note that it is hard to be sure of the topology, because a failure may remove a network element and move the network from uloop-free to uloop-possible.

> If not, please correct me.
> If so,
> I agree. This is not new (IMO) and is also applicable to RLFA, which mentions this (credit to you) in its Section 10: https://datatracker.ietf.org/doc/html/rfc7490#section-10
> I had proposed that you add the same text to TI-LFA (for simplicity, since you, the WG and the IESG already agree on it), but after discussion with you and Sasha the currently proposed text is the following:
>  
>  
> TI-LFA is a local operation applied by the PLR when it detects failure of one of its local links. As such, it does not affect:
> Micro-loops that appear – or do not appear – as part of the distributed IGP convergence [RFC5715] on the paths to the destination that do not pass thru TI-LFA paths
>     i.  As explained in RFC 5714, such micro-loops may result in the traffic not reaching the PLR and therefore not following TI-LFA paths
>     ii. Segment Routing may be used for prevention of such micro-loops as described in the micro-loop avoidance draft

How about:

ii. Any of the methods described in RFC 5714 may be used to prevent the formation of micro-loops, and some of these methods may be enhanced, or new methods designed, through the use of Segment Routing. A number of these methods may be used concurrently in the network.


> Micro-loops that appear – or do not appear - when the failed link is repaired
> TI-LFA paths are loop-free. What’s more, they follow the post-convergence paths and are therefore not subject to micro-loops due to differences in the IGP convergence times of the nodes thru which they pass
> TI-LFA paths are applied from the moment the PLR detects failure of a local link until IGP convergence at the PLR is completed. Therefore, early (relative to the other nodes) IGP convergence at the PLR and the consequent “early” release of TI-LFA paths may cause micro-loops, especially if these paths have been computed using the methods described in Section 6.2, 6.3 or 6.4 of the draft. One of the possible ways to prevent such micro-loops is local convergence delay (RFC 8333).
> TI-LFA procedures are complementary to application of any micro-loop avoidance procedures in the case of link or node failure:
> Link or node failure requires some urgent action to restore the traffic that passed thru the failed resource. TI-LFA paths are pre-computed and pre-installed and therefore suitable for urgent recovery
> The paths used in the micro-loop avoidance procedures typically cannot be pre-computed.
>  
>  
> https://mailarchive.ietf.org/arch/msg/rtgwg/oY3gGIZMRCTRptTDxrpuSaBztGY/ (proposal)
> https://mailarchive.ietf.org/arch/msg/rtgwg/oY3gGIZMRCTRptTDxrpuSaBztGY/ (next email with Sasha agreeing)
>  
> That being said, I’m not married to this text: it’s just that Sasha proposed text (thanks Sasha) and I agreed with it. It’s OK to change the text if you want to propose something else for some parts. (Personally, I feel that the text could be made more concise, but after so many difficulties in communicating, I was happy to jump on a proposed text.)
> I would just assume that the text you propose would be along the same lines.
>  
>  
> Next, is this micro-loop aspect the only issue you wanted to raise or is there another point?

Yes, this is the other key point. It should probably go in a section on micro-loops.

Then I might suggest that we need a section similar to Section 10 of RFC 7490 that addresses the management considerations and points back to the uloop text in the TI-LFA draft. This ensures that the operator community (and in particular their managers and accountants) are more easily made aware of this concern and are not tempted to optimise it away.

Thank you for considering these points.

Stewart

>  
> --Bruno
>  
>  
> From: Stewart Bryant <stewart.bryant@gmail.com>
> Sent: Wednesday, November 8, 2023 9:37 AM
> To: Gyan Mishra <hayabusagsm@gmail.com>
> Cc: Stewart Bryant <stewart.bryant@gmail.com>; Ahmed Bashandy <abashandy.ietf@gmail.com>; Alexander Vainshtein <Alexander.Vainshtein@rbbn.com>; rtgwg@ietf.org; draft-ietf-rtgwg-segment-routing-ti-lfa@ietf.org; rtgwg-chairs <rtgwg-chairs@ietf.org>
> Subject: Re: draft-ietf-rtgwg-segment-routing-ti-lfa : A simple pathological network fragment
>  
>  
>  
> 
> On 8 Nov 2023, at 05:18, Gyan Mishra <hayabusagsm@gmail.com> wrote:
>  
>  
>  
> In the RLFA (RFC 7490) style loop topology below, R1, R4 and R5 are in the extended P-space, the Q-space is R5, R6 and R3, and the TI-LFA algorithm computes the post-convergence path with R5 as the RLFA PQ node.
>  
> Using Section 6.4 to build the post-convergence repair path with RFC 5715 near-side tunneling, the repair path is NodeSid(R5), AdjSid(R6). A near-side tunnel is thus built from R1 to R6.
>  
> Packets being looped back to R1 by R4 or R5 is not an issue, as the repair path is built from R1 to R6, tunneling over any nodes with unconverged FIBs.
>  
> Micro loop problem solved!
>  
>  
> CE1 –R1- R2-/-R3-CE2
>      |         |
>      R4 – R5 -R6
>  
> I think that it is important to note that if R1 reconverges first it will send packets to R4 using normal forwarding. However, R4 still has ECMP paths to CE2, one of them via R1, so packets sent that way will micro-loop back to R4.
>  
> At this point the repair is starved and no longer works.
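
The failure case described above can be checked with a small sketch. It assumes unit link costs on every link and uses R3 as a stand-in for CE2; the function names (`ecmp_fib`, `two_node_loops`) are hypothetical and exist only for this illustration. It computes each router's ECMP next hops before and after the R2-R3 failure, then lists neighbour pairs where a converged router forwards to a neighbour whose stale FIB can send the packet straight back.

```python
from collections import deque

# Links of the fragment (CE routers omitted). Unit link costs are an assumption.
LINKS = [("R1", "R2"), ("R2", "R3"), ("R1", "R4"),
         ("R4", "R5"), ("R5", "R6"), ("R6", "R3")]
FAILED = ("R2", "R3")
DEST = "R3"  # CE2 attaches to R3, so R3 stands in for the destination


def adjacency(links):
    """Build an undirected adjacency map from a list of links."""
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def ecmp_fib(links, dest):
    """Return {node: sorted ECMP next hops towards dest} using hop-count SPF."""
    adj = adjacency(links)
    dist = {dest: 0}
    queue = deque([dest])
    while queue:  # breadth-first search outwards from the destination
        node = queue.popleft()
        for nbr in adj[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return {n: sorted(m for m in adj[n] if dist.get(m) == dist[n] - 1)
            for n in dist if n != dest}


def two_node_loops(converged_fib, stale_fib):
    """Pairs (a, b): a has converged and forwards to b, while b still
    holds its old FIB and may forward back to a."""
    return sorted((a, b) for a, hops in converged_fib.items()
                  for b in hops if a in stale_fib.get(b, []))


pre_failure = ecmp_fib(LINKS, DEST)
post_failure = ecmp_fib([link for link in LINKS if link != FAILED], DEST)

print("R4 before failure:", pre_failure["R4"])
print("R1 after failure: ", post_failure["R1"])
print("possible loops:   ", two_node_loops(post_failure, pre_failure))
```

Under these assumptions the sketch reports the pair (R1, R4): R1 has converged and forwards to R4, while R4's stale ECMP set still contains R1. It also reports (R2, R1), the corresponding loop if R2 converges before R1.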
>  
> Hence the point that I have been making and I think the point that Gyan originally made.
>  
> Without FRR the network converges in its own time, and we accept micro-loops and traffic discontinuity for an unknown time, plus collateral damage to traffic that never used the failed link.
>  
> However, once we deploy FRR we make a contract with the user that after a short while (of the order of 50 ms) productive forwarding will continue uninterrupted. This is not the case in some topologies (see above), and thus uloop prevention is required.
>  
> The thread has become somewhat difficult to follow with time, so I am no longer sure what Bruno’s text is. It would be helpful if it were repeated. However, I think the draft has to say that a uloop strategy is required in order to warrant that FRR continues to provide traffic continuity until the network has reconverged.
>  
> I would note, as it is easily forgotten, that a uloop strategy is also required when R2-R3 goes back into service. This is because if R4 converges first it will ECMP back to R1, which will send the packet back to R4.
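
The repair case can be traced hop by hop in the same spirit. The two FIB tables below are hypothetical: they were derived by hand for traffic towards CE2 (reached via R3), assuming unit link costs, and `trace` is an illustrative helper rather than anything defined in the draft. Each router uses its "link up" FIB if it has already converged and its "link down" FIB otherwise.

```python
# Hypothetical next hops towards CE2 (via R3), assuming unit link costs.
FIB_LINK_DOWN = {"R1": ["R4"], "R2": ["R1"], "R4": ["R5"],
                 "R5": ["R6"], "R6": ["R3"]}
FIB_LINK_UP = {"R1": ["R2"], "R2": ["R3"], "R4": ["R1", "R5"],
               "R5": ["R6"], "R6": ["R3"]}


def trace(src, dest, converged, ecmp_choice=0):
    """Follow one packet from src towards dest.

    converged   -- set of routers that already use the link-up FIB
    ecmp_choice -- index selecting which ECMP member a router picks
    Returns (path, looped), where looped is True if a router is revisited.
    """
    path, seen, node = [src], {src}, src
    while node != dest:
        fib = FIB_LINK_UP if node in converged else FIB_LINK_DOWN
        hops = fib[node]
        node = hops[ecmp_choice % len(hops)]
        path.append(node)
        if node in seen:  # the packet has come back to a router it visited
            return path, True
        seen.add(node)
    return path, False


# R2-R3 is back in service and only R4 has converged so far.
print(trace("R1", "R3", converged={"R4"}, ecmp_choice=0))
print(trace("R1", "R3", converged={"R4"}, ecmp_choice=1))
# Once every router has converged the packet takes the restored short path.
print(trace("R1", "R3", converged={"R1", "R2", "R4", "R5", "R6"}))
```

With only R4 converged, the ECMP member pointing at R1 bounces the packet between R1 and R4, while the member pointing at R5 still delivers it. Roughly half of the flows are therefore affected under this model until R1 also converges.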
>  
> Now we need to be clear that the micro-looping is not the fault of the TI-LFA design per se, but given that networks will deploy TI-LFA with certain traffic continuity expectations, we must clearly note to the reader that those expectations may not be met without addressing the uloop problem.
>  
> By way of referencing earlier work, RFC 5714 does point to RFC 5715, stating that a uloop technology is needed. In Section 10 of RFC 7490 the issue of loops is drawn to the attention of the network manager, although perhaps with hindsight the text should be stronger.
>  
> - Stewart