Re: draft-ietf-rtgwg-segment-routing-ti-lfa : A simple pathological network fragment

Gyan Mishra <hayabusagsm@gmail.com> Wed, 08 November 2023 05:18 UTC

From: Gyan Mishra <hayabusagsm@gmail.com>
Date: Wed, 08 Nov 2023 00:18:36 -0500
Message-ID: <CABNhwV1ud2RyH_hCb1NOtBWiQ15e5P6Qx0mvrgs7h+tS6PyS=w@mail.gmail.com>
Subject: Re: draft-ietf-rtgwg-segment-routing-ti-lfa : A simple pathological network fragment
To: Ahmed Bashandy <abashandy.ietf@gmail.com>
Cc: Alexander Vainshtein <Alexander.Vainshtein@rbbn.com>, "bruno.decraene@orange.com" <bruno.decraene@orange.com>, "draft-ietf-rtgwg-segment-routing-ti-lfa@ietf.org" <draft-ietf-rtgwg-segment-routing-ti-lfa@ietf.org>, rtgwg-chairs <rtgwg-chairs@ietf.org>, "rtgwg@ietf.org" <rtgwg@ietf.org>
Content-Type: multipart/alternative; boundary="000000000000cf010706099d3818"
Archived-At: <https://mailarchive.ietf.org/arch/msg/rtgwg/-YeWRJFhVA6DM3PwysQhdp4XDmY>

Hi Ahmed, Authors, & RTGWG WG,

After going through all the mailing list discussions by Sasha, Bruno, and
Stewart (many thanks!!) on TI-LFA v11, below is a synopsis and update of
my OPSDIR review from this past summer:

I believe the key confusion in the draft is the reference to the uLoop
draft.  It gives the impression that TI-LFA may be lacking micro-loop
avoidance mechanisms and that, in order to deploy TI-LFA, uLoop should be
deployed in conjunction.  This is not true, as we have all discussed on
this thread, and it is an important reason why the uLoop draft should not
be referenced in any way by the TI-LFA draft.

This draft as it stands today has inherent micro-loop avoidance mechanisms,
with near-side tunneling from P to Q space along the post-convergence path,
which by definition is a loop-free path.  I did dig a little deeper into
why the post-convergence path is loop free, and I think that may need some
more text around it to make it clear to readers as well as implementers.

Section 3, on the repair path, clearly defines the micro-loop avoidance
mechanisms for every scenario.  These can be accomplished with at most 2
SIDs (Node-SID to Q space plus Adj-SID to PQ node), which covers 99% of
cases, with less than 1% needing maybe up to 4 SIDs.  I completely agree
there is no need for a list of hop-by-hop Adj-SIDs, as that does not
provide any benefit and does consume more resources.

In my review I detailed a common “loop” physical topology, such as the one
below, where the distributed convergence of nodes along the path may be an
issue when building an RLFA near-side tunnel from P space to the PQ node in
Q space.

The key here is that the near-side tunnel is built across the loop-free
post-convergence path, made up of explicit path nodes that the TI-LFA
algorithm picks as post-convergence nodes.  The state of the FIB on those
nodes does not matter, which is what makes the post-convergence path a
“loop free” path.

It is important to note that along the post-convergence path there may
exist nodes that have not converged, which could otherwise result in
looping.  However, the key is that the post-convergence nodes are FIB-state
independent, which is all that matters, thus yielding a loop-free repair
path regardless of any trail of miscellaneous un-converged nodes that still
exist along the repair path.  This makes TI-LFA bulletproof!
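
The FIB-state independence described above can be illustrated with a small
forwarding walk.  This is a sketch under assumptions rather than anything
taken from the draft: a six-router ring R1-R2-R3-R6-R5-R4-R1 with unit
metrics, the R2-R3 link failed, R2 acting as PLR, and a repair segment list
of Node-SID(R5) followed by Adj-SID(R5 to R6).  Every transit router is
tried with both its pre-failure ("old") and post-failure ("new") FIB, and
all function and variable names are hypothetical.

```python
from collections import deque
from itertools import product

# Assumed topology: six-router ring, every link metric equal to 1.
RING = [("R1", "R2"), ("R2", "R3"), ("R3", "R6"),
        ("R6", "R5"), ("R5", "R4"), ("R4", "R1")]
FAILED = ("R2", "R3")  # assumed protected link


def adjacency(links):
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def hops(adj, src):
    """Breadth-first search; with unit metrics hop count equals IGP cost."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def next_hop(adj, node, target):
    """Neighbor of node closest to target (ties broken by name)."""
    dist = hops(adj, target)
    return min(adj[node], key=lambda n: (dist[n], n))


FIB = {
    "old": adjacency(RING),                                      # pre-failure view
    "new": adjacency([link for link in RING if link != FAILED]),  # post-failure view
}


def forward(node, target, state, path):
    """Follow each router's own FIB toward target; raise if a node repeats."""
    while node != target:
        node = next_hop(FIB[state[node]], node, target)
        if node in path:
            raise RuntimeError("loop at " + node)
        path.append(node)


def repair_walk(state):
    # The PLR (R2) pushes the segment list and uses its post-convergence next hop.
    path = ["R2", next_hop(FIB["new"], "R2", "R5")]
    forward(path[-1], "R5", state, path)  # segment 1: Node-SID(R5)
    path.append("R6")                     # segment 2: Adj-SID(R5 -> R6), no FIB lookup
    forward("R6", "R3", state, path)      # remaining hop to the far side of the failure
    return path


TRANSIT = ["R1", "R4", "R6"]
WALKS = {combo: repair_walk(dict(zip(TRANSIT, combo)))
         for combo in product(["old", "new"], repeat=len(TRANSIT))}
for combo, path in WALKS.items():
    print(combo, "->", " ".join(path))
```

In this topology every old/new combination takes the same walk, R2 R1 R4 R5
R6 R3, because the shortest path from R1 and R4 toward R5 is the same before
and after the failure.  It illustrates the argument for one topology only
and is not a general proof.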

ECMP is not an issue when using a Prefix-SID, as the near-side tunnel is
label switched, with MPLS ECMP entropy label (ELI/EL) load balancing in the
underlay, across nodes that have non-converged FIBs along the
post-convergence path.  Since the path is label switched to the Prefix-SID
of the PQ node, which is the post-convergence node calculated by TI-LFA,
the LFIB entry is stable on other un-converged nodes along the repair path,
since it is not the final destination D at the egress PE node.  This is
really a major key point AFAIK as well: the TI-LFA algorithm's choice of
loop-free post-convergence nodes prevents any loop from occurring, since we
are hopping to a stable intermediate PQ node along the post-convergence
path.

Thus there is absolutely no need for the uLoop draft, and TI-LFA can work
completely independently and provide loop-free convergence along the
post-convergence repair path.

A point to note about the uLoop draft: as it progresses, it should describe
the interaction between TI-LFA and uLoop, and the timing and sequence of
events when both features are configured together.  It should also give
recommendations on when to use one versus the other, the pros and cons of
each, which use cases call for uLoop, and when to use both together and
why.  As TI-LFA came first, the onus is on the uLoop draft to provide the
permutations of use cases, with TI-LFA as a normative reference.

Some points below looking at the use case mentioned in the OPSDIR review.

Here we have a typical trail-of-nodes use case, with nodes R4 and R5
depicting distributed IGP convergence along the post-convergence path.
Where R1 converges before R4 and R5, we can see that the TI-LFA draft's
micro-loop avoidance mechanisms work as designed, per below.

In the RFC 7490 RLFA-style loop topology below, R1, R4, and R5 are in the
extended P space, the Q space is R5, R6, and R3, and the RLFA PQ node
calculated by the TI-LFA algorithm on the post-convergence path is R5.

Using section 6.4 to build the post-convergence repair path with RFC 5715
near-side tunneling, the repair path is NodeSid(R5), AdjSid(R6).  So a
near-side tunnel is now built from R1 to R6.
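
As a rough cross-check of the P and Q spaces stated above, the following
Python sketch recomputes them for the six-router ring in this email's
diagram.  It is only an illustration under assumptions that are not in the
draft or the thread: every link has metric 1, R2 is taken as the PLR
because it sits next to the failed R2-R3 link, R3 stands in for the
destination side, and all function names are made up for the example.

```python
from collections import deque

# Hypothetical model of the ring in the diagram; all link metrics assumed to be 1.
LINKS = [("R1", "R2"), ("R2", "R3"), ("R3", "R6"),
         ("R6", "R5"), ("R5", "R4"), ("R4", "R1")]
FAILED = ("R2", "R3")   # protected link, drawn as -/- in the diagram
PLR, DEST = "R2", "R3"  # assumed point of local repair and far-side router


def adjacency(links):
    adj = {}
    for a, b in links:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def hops(adj, src):
    """Breadth-first search; with unit metrics hop count equals IGP cost."""
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def avoids_link(adj, src, dst, link):
    """True if no shortest path src -> dst (ECMP included) crosses link."""
    a, b = link
    d_src, d_dst = hops(adj, src), hops(adj, dst)
    via_link = min(d_src[a] + 1 + d_dst[b], d_src[b] + 1 + d_dst[a])
    return via_link > d_src[dst]


def p_space(adj, src, link):
    return {n for n in adj if avoids_link(adj, src, n, link)}


def extended_p_space(adj, plr, link):
    # Union of the PLR's P space and the P spaces of its other neighbors.
    far_end = link[1] if link[0] == plr else link[0]
    spaces = [p_space(adj, plr, link)]
    spaces += [p_space(adj, nbr, link) for nbr in adj[plr] - {far_end}]
    return set().union(*spaces)


def q_space(adj, dest, link):
    # Links are symmetric here, so a reverse SPF is the same as a forward one.
    return {n for n in adj if avoids_link(adj, n, dest, link)}


ADJ = adjacency(LINKS)
EXT_P = extended_p_space(ADJ, PLR, FAILED) - {PLR}  # PLR itself left out
Q = q_space(ADJ, DEST, FAILED)
PQ = EXT_P & Q
print("extended P:", sorted(EXT_P))
print("Q:", sorted(Q))
print("PQ:", sorted(PQ))
```

Under these assumptions the sketch gives an extended P space of R1, R4, R5,
a Q space of R3, R5, R6, and R5 as the only PQ node, in line with the
values above.  Different link metrics would change the result.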

Looping is not an issue with R4 or R5 looping packets back to R1, as the
repair path is built from R1 to R6, tunneling over any nodes with
un-converged FIBs.

Micro loop problem solved!


CE1 - R1 - R2 -/- R3 - CE2
      |           |
      R4 -- R5 -- R6



I am changing my OPSDIR review status to “Ready”: the draft is ready for
publication, with the following recommendations:

-Remove any reference to uLoop draft

-Add Bruno’s updated text

-Provide more clarification on 6.1 FRR direct neighbor.  My understanding
AFAIK is that TI-LFA is an RLFA solution for protection, replacing RFC 7490
T-LDP based RLFA with SR based RLFA.  Also, AFAIK, TI-LFA is an extension
of the original IP FRR RFC 5286.  So in the case where a directly connected
backup path exists via local protection, it is a “Local LFA” path.  I
don’t think TI-LFA comes into play here.  I have tested this on Cisco XR
with ISIS and OSPF, and on other vendors as well, and when a directly
connected backup path exists the path is flagged as “Local-LFA” and not
TI-LFA.  So I don’t agree with section 6.1 that the TI-LFA
post-convergence path comes into play here.

-Please feel free to use some of the text in this email related to the
importance and characteristics of the post convergence path as well as how
the situation with ECMP is dealt with.

For tomorrow’s side meeting I could spin up a few slides based on this
email if it would help steer the discussion to closure, as I think we are
almost there.

Many thanks to all the folks that participated in this lively discussion!!!


I wish I was in Prague!  Maybe next time!

Cheers!!

Gyan

On Mon, Nov 6, 2023 at 9:36 AM Ahmed Bashandy <abashandy.ietf@gmail.com>
wrote:

> great
>
> I'll change the wording accordingly
>
> Ahmed
>
> On 11/1/23 10:09 PM, Gyan Mishra wrote:
> > Hi Sasha, Bruno & Stewart
> >
> > Thank you for going over my OPSDIR review in detail.
> >
> > I am good with the latest updated verbiage that Bruno had given.
> >
> > Comments in-line
> >
> > On Mon, Oct 23, 2023 at 8:41 AM Alexander Vainshtein <
> > Alexander.Vainshtein@rbbn.com> wrote:
> >
> >> Bruno,
> >>
> >> Lots of thanks for a prompt and very encouraging response!
> >>
> >>
> >>
> >> Your version of the text is definitely better than mine, I am all for
> >> using it.
> >>
> >>
> >>
> >> As for where the clarifying text could be inserted, I see two options:
> >>
> >>     - A common “Applicability Statement” section (there is no such
> >>       section in the draft)
> >>
> >>     - A dedicated section on relationship between TI-LFA and
> >>       micro-loops.
> >>
> >>      Gyan> I think this option would be best.  This would fix the
> >> existing gap on uLoop.  I did mention this, but am not sure if it is
> >> possible: as TI-LFA and uLoop are tightly coupled as an overall post
> >> convergence solution, is it possible to combine the drafts and issue
> >> another WGLC?  (Question for authors)
> >>
> >> In any case, I defer to you and the rest of the authors to decide
> >> what, if anything should be done for clarifying the relationship
> >> between TI-LFA and micro-loops.
> >>
> >>
> >>
> >>
> >>
> >> Regards,
> >>
> >> Sasha
> >>
> >>
> >>
> >> *From:* bruno.decraene@orange.com <bruno.decraene@orange.com>
> >> *Sent:* Monday, October 23, 2023 3:27 PM
> >> *To:* Alexander Vainshtein <Alexander.Vainshtein@rbbn.com>
> >> *Cc:* rtgwg@ietf.org; rtgwg-chairs <rtgwg-chairs@ietf.org>;
> >> draft-ietf-rtgwg-segment-routing-ti-lfa@ietf.org; Stewart Bryant <
> >> stewart.bryant@gmail.com>
> >> *Subject:* [EXTERNAL] RE: draft-ietf-rtgwg-segment-routing-ti-lfa : A
> >> simple pathological network fragment
> >>
> >>
> >>
> >> Sasha,
> >>
> >>
> >>
> >> Thanks for the summary and the constructive proposal.
> >>
> >> Speaking for myself, this makes sense and I agree.
> >>
> >>
> >>
> >> -  TI-LFA is a local operation applied by the PLR when it detects
> >>    failure of one of its local links. As such, it does not affect:
> >>
> >>    o  Micro-loops that appear – or do not appear – on the paths to the
> >>       destination that do not pass thru TI-LFA paths
> >>
> >>
> >>
> >> As an editorial comment, depending on where such text would be
> >> inserted, I would propose the following change:
> >>
> >> OLD: Micro-loops that appear – or do not appear –
> >>
> >> NEW: Micro-loops that appear – or do not appear – as part of the
> >> distributed IGP convergence [RFC5715]
> >>
> >>
> >>
> >> Motivation: some reader could wrongly understand that such
> >> micro-loops are caused by TI-LFA
> >>
> >>
> >>
> >> Thanks,
> >>
> >> Regards,
> >>
> >> --Bruno
> >>
> >>
> >>
> >>
> >> *From:* Alexander Vainshtein <Alexander.Vainshtein@rbbn.com>
> >> *Sent:* Sunday, October 22, 2023 4:21 PM
> >> *To:* DECRAENE Bruno INNOV/NET <bruno.decraene@orange.com>; Stewart
> >> Bryant <stewart.bryant@gmail.com>
> >> *Cc:* rtgwg@ietf.org; rtgwg-chairs <rtgwg-chairs@ietf.org>;
> >> draft-ietf-rtgwg-segment-routing-ti-lfa@ietf.org
> >> *Subject:* RE: draft-ietf-rtgwg-segment-routing-ti-lfa : A simple
> >> pathological network fragment
> >> *Importance:* High
> >>
> >>
> >>
> >> Bruno, Stewart and all,
> >>
> >> I think that most of the things about TI-LFA and micro-loops have been
> >> said already (if in a slightly different context)  and are mainly
> >> self-evident.
> >>
> >> However, I share the feeling that somehow the relationship between
> >> TI-LFA and micro-loop avoidance has become somewhat muddled.
> >>
> >>
> >>
> >> Therefore, I would like to suggest adding some text to the TI-LFA draft
> >> that clarifies this relationship, e.g., along the following lines:
> >>
> >> 1.       TI-LFA is a local operation applied by the PLR when it detects
> >> failure of one of its local links. As such,  it does not affect:
> >>
> >> a.       Micro-loops that appear – or do not appear –on the paths to the
> >> destination that do not pass thru TI-LFA paths
> >>
> >>
> >> i.      As explained in RFC 5714, such micro-loops may result in the
> >> traffic not reaching the PLR and therefore not following TI-LFA paths
> >>
> >>
> >> ii.      Segment Routing may be used for prevention of such micro-loops
> >> as described in the micro-loop avoidance draft
> >>
> >> b.       Micro-loops that appear – or do not appear - when the failed
> >> link is repaired (*aside: the need for this line is based on personal
> >> experience**☹*)
> >>
> >> 2.       TI-LFA paths are loop-free. What’s more, they follow the
> >> post-convergence paths, and, therefore, are not subject to micro-loops
> >> due to difference in the IGP convergence times of the nodes thru which
> >> they pass
> >>
> >> 3.       TI-LFA paths are applied from the moment the PLR detects
> >> failure of a local link and until IGP convergence at the PLR is
> >> completed. Therefore, early (relative to the other nodes) IGP
> >> convergence at the PLR and the consequent “early” release of TI-LFA
> >> paths may cause micro-loops, especially if these paths have been
> >> computed using the methods described in Section 6.2, 6.3 or 6.4 of the
> >> draft. One of the possible ways to prevent such micro-loops is local
> >> convergence delay (RFC 8333).
> >>
> >> 4.       TI-LFA procedures are complementary to application of any
> >> micro-loop avoidance procedures in the case of link or node failure:
> >>
> >> a.       Link or node failure requires some urgent action to restore
> >> the traffic that passed thru the failed resource. TI-LFA paths are
> >> pre-computed and pre-installed and therefore suitable for urgent
> >> recovery
> >>
> >> b.       The paths used in the micro-loop avoidance procedures typically
> >> cannot be pre-computed.
> >>
> >>
> >>
> >> Hopefully these notes would be useful.
> >>
> >>
> >>
> >> Regards,
> >>
> >> Sasha
> >>
> >>
> >>
> >> *From:* rtgwg <rtgwg-bounces@ietf.org> *On Behalf Of *
> >> bruno.decraene@orange.com
> >> *Sent:* Thursday, October 19, 2023 7:34 PM
> >> *To:* Stewart Bryant <stewart.bryant@gmail.com>
> >> *Cc:* rtgwg@ietf.org; rtgwg-chairs <rtgwg-chairs@ietf.org>;
> >> draft-ietf-rtgwg-segment-routing-ti-lfa@ietf.org
> >> *Subject:* [EXTERNAL] RE: draft-ietf-rtgwg-segment-routing-ti-lfa : A
> >> simple pathological network fragment
> >>
> >>
> >>
> >> Hi Stewart,
> >>
> >>
> >>
> >> I agree with you on the technical points, so the first part of your
> >> email up to “So I think”.
> >>
> >>
> >>
> >> But I don’t quite follow why you want to mix IGP Convergence issues with
> >> this Fast ReRoute Solution.
> >>
> >> To quote RFC 5714, “IP Fast Reroute Framework”:
> >>
> >>
> >>
> >>     In order to reduce packet disruption times to a duration commensurate
> >>     with the failure detection times, two mechanisms may be required:
> >>
> >>     a.  A mechanism for the router(s) adjacent to the failure to rapidly
> >>         invoke a repair path, which is unaffected by any subsequent
> >>         re-convergence.
> >>
> >>     b.  In topologies that are susceptible to micro-loops, a micro-loop
> >>         control mechanism may be required [RFC5715
> >> <https://datatracker.ietf.org/doc/html/rfc5715>].
> >>
> >>     Performing the first task without the second may result in the
> >>     repair path being starved of traffic and hence being redundant.
> >>
> >>
> >>
> >> https://datatracker.ietf.org/doc/html/rfc5714#section-4
> >>
> >>
> >>
> >> I would assume that you agree with the above (as you are an author of
> >> this RFC and my guess would be that you wrote that text)
> >>
> >>
> >>
> >> My point is that there are two different mechanisms involved, in two
> >> different time periods:
> >>
> >> -     Fast ReRoute (“a”): this is the scope of
> >>       draft-ietf-rtgwg-segment-routing-ti-lfa
> >>
> >>       o   Timing: from detection time to start of the IGP convergence
> >>
> >> -     IGP Micro-loop avoidance (“b”)
> >>
> >>       o   Timing: from start of IGP convergence to end of IGP convergence
> >>
> >>
> >>
> >> The scope of draft-ietf-rtgwg-segment-routing-ti-lfa is FRR / “a”. IGP
> >> micro-loop is out of scope. Other documents are proposing solutions for
> >> this. (And for those micro-loop documents, FRR is similarly out of
> >> scope.)
>