[Teas] Re: Comments abount rfc2205 Resource ReSerVation Protocol (RSVP)

Vishnu Pavan Beeram <vishnupavan@gmail.com> Fri, 25 October 2024 09:09 UTC

From: Vishnu Pavan Beeram <vishnupavan@gmail.com>
Date: Fri, 25 Oct 2024 14:39:24 +0530
Message-ID: <CA+YzgTvhcLcM7f_m70trorij72eUqRwGFuhJZ6scfJ94ny35vA@mail.gmail.com>
To: Tuấn Anh Vũ <anhvt.hdg@gmail.com>
CC: adrian@olddog.co.uk, TEAS WG <teas@ietf.org>
Subject: [Teas] Re: Comments abount rfc2205 Resource ReSerVation Protocol (RSVP)
List-Id: Traffic Engineering Architecture and Signaling working group discussion list <teas.ietf.org>
Archived-At: <https://mailarchive.ietf.org/arch/msg/teas/nCwucwaW5fltZ4a1FDQHcwFZs9o>

AnhVT, Hi!


If R1 includes a Notify Request object in the Path message, R3 could send an
RSVP Notify message to R1 (if R1 is still reachable) with a relevant
ErrorSpec when it detects the link-down event and cleans up the state. R1
can then reroute accordingly (without relying on R2 to send a
PathErr/ResvTear).

https://datatracker.ietf.org/doc/html/rfc3473#section-4.3

However (AFAIK), this isn’t widely supported/deployed functionality in
IP/MPLS networks.
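
As a rough illustration of the Notify option (RFC 3473, section 4.3), the
decision on R3 could be sketched as below. This is a minimal Python sketch
and is not taken from any implementation: PathState, on_upstream_link_down,
and the string message names are hypothetical, and reachability of the
notify target is simply passed in as a flag.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PathState:
    """Hypothetical per-LSP path state held on a transit node such as R3."""
    lsp_name: str
    # Address carried in the Notify Request object of the Path message,
    # or None if the ingress did not include one.
    notify_request_addr: Optional[str]


def on_upstream_link_down(
    psb: PathState,
    downstream_hop: str,
    notify_target_reachable: bool,
) -> List[Tuple[str, str]]:
    """Return the (message, destination) pairs to send when the upstream link fails.

    The PathTear toward the downstream hop is always sent. A Notify is added
    only if the Path message requested one and the target is still reachable.
    """
    actions = [("PathTear", downstream_hop)]
    if psb.notify_request_addr is not None and notify_target_reachable:
        # Notify is addressed directly to the requesting node, so it can be
        # IP-routed around the failed link instead of going hop by hop.
        actions.append(("Notify", psb.notify_request_addr))
    return actions


if __name__ == "__main__":
    psb = PathState(lsp_name="lsp-r1-r4", notify_request_addr="192.0.2.1")
    for message, destination in on_upstream_link_down(psb, "R4", True):
        print(message, "->", destination)
```

Without a Notify Request in the Path message, the sketch falls back to the
PathTear alone, which is the case discussed earlier in this thread.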



If you are having issues with continuous interface flaps, you may want to
consider enabling interface dampening (along the lines of
https://datatracker.ietf.org/doc/html/draft-ietf-netmod-intf-ext-yang-14#section-2.2)
to reduce the impact on control-plane protocols. That said, this is outside
the scope of this mailing list.
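
For readers unfamiliar with dampening, the general idea can be sketched as a
penalty that grows on each flap and decays exponentially over time, with the
interface held down while the penalty is above a threshold. The Python
sketch below assumes that common penalty/half-life scheme. The class name,
parameter names, and default values are illustrative assumptions and are
not taken from the referenced draft or from any vendor implementation.

```python
class FlapDampener:
    """Illustrative penalty-based interface dampening (all values are assumptions)."""

    def __init__(self, half_life_s=5.0, suppress=2000.0, reuse=1000.0,
                 penalty_per_flap=1000.0):
        self.half_life_s = half_life_s            # time for the penalty to halve
        self.suppress = suppress                  # start suppressing at or above this
        self.reuse = reuse                        # stop suppressing below this
        self.penalty_per_flap = penalty_per_flap  # added on every flap
        self.penalty = 0.0
        self.last_update = 0.0
        self.suppressed = False

    def _decay(self, now):
        # Exponential decay of the accumulated penalty since the last update.
        elapsed = now - self.last_update
        self.penalty *= 0.5 ** (elapsed / self.half_life_s)
        self.last_update = now
        if self.suppressed and self.penalty < self.reuse:
            self.suppressed = False

    def record_flap(self, now):
        """Account for one flap observed at time `now` (seconds)."""
        self._decay(now)
        self.penalty += self.penalty_per_flap
        if self.penalty >= self.suppress:
            self.suppressed = True

    def is_suppressed(self, now):
        """Report whether the interface should be hidden from upper layers."""
        self._decay(now)
        return self.suppressed


if __name__ == "__main__":
    dampener = FlapDampener()
    for t in (0.0, 1.0, 2.0):   # three flaps, one second apart
        dampener.record_flap(t)
    print(dampener.is_suppressed(3.0), dampener.is_suppressed(12.0))
```

With these assumed values, three flaps in quick succession push the penalty
over the suppress threshold, and the interface is reported to the control
plane again only after the penalty has decayed below the reuse threshold.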


Regards,

-Pavan

On Fri, Oct 25, 2024 at 12:43 PM Tuấn Anh Vũ <anhvt.hdg@gmail.com> wrote:

> Hi Pavan,
>
> *Since R2 is (oblivious of the link flap on R3 and) still maintaining
> state, it should eventually refresh the path state which would result in R3
> and R4 re-instantiating the LSP state. When the Resv message with the new
> label reaches R2, it would delete the existing reservation state (send a
> ResvTear) and recreate the state with the new label (and send a Resv
> upstream). And R1 would also delete and add reservation state as part of
> the ResvTear and Resv processing (and the LSP would be deemed healthy
> again). Is this not happening in your setup?*
>
> *AnhVT:* It takes a long time to clean up the reservation state based on a
> timeout or a refresh message. This is unacceptable for ISP services.
>
>
> *If you use RSVP Hellos per neighbor interface, R3 could update the
> instance value when the link flaps. And this change in the instance value
> can also prompt R2 to refresh state immediately (even without any Graceful
> Restart procedures coming into play) resulting in R3 and R4
> re-instantiating state again.*
>
> *AnhVT:* The RSVP direct neighbor (link neighbor) is one mechanism to bring
> down the LSP when the RSVP direct neighbor changes state from UP to DOWN.
> In my case, the RSVP direct neighbor had not come up before that, because
> the AE had flapped many times before the stuck issue happened. Every time
> the AE between R2 and R3 comes up again, it takes 1-9s to bring up the RSVP
> direct neighbor (the hello timer is 9s). Before the issue that I described
> in the first email (the LACP flap on R3 but not R2), the AE had been
> flapping for some seconds (less than 9s), so the RSVP direct neighbor had
> not come up yet.
>
>
> Regards,
>
> AnhVT
>
> On Fri, Oct 25, 2024 at 13:11, Vishnu Pavan Beeram <
> vishnupavan@gmail.com> wrote:
>
>> AnhVT, Hi!
>>
>> Thanks for the additional details. The key detail that was missing in
>> your earlier email was that the link comes back up immediately on R3.
>>
>>
>>
>> Since R2 is (oblivious of the link flap on R3 and) still maintaining
>> state, it should eventually refresh the path state which would result in R3
>> and R4 re-instantiating the LSP state. When the Resv message with the new
>> label reaches R2, it would delete the existing reservation state (send a
>> ResvTear) and recreate the state with the new label (and send a Resv
>> upstream). And R1 would also delete and add reservation state as part of
>> the ResvTear and Resv processing (and the LSP would be deemed healthy
>> again). Is this not happening in your setup?
>>
>>
>>
>> If you use RSVP Hellos per neighbor interface, R3 could update the
>> instance value when the link flaps. And this change in the instance value
>> can also prompt R2 to refresh state immediately (even without any Graceful
>> Restart procedures coming into play) resulting in R3 and R4
>> re-instantiating state again.
>>
>>
>> Hope this helps.
>>
>>
>> Regards,
>>
>> -Pavan
>>
>> On Thu, Oct 24, 2024 at 1:31 PM Tuấn Anh Vũ <anhvt.hdg@gmail.com> wrote:
>>
>>> Hi,
>>> Thanks for all your answers, please find my view below:
>>> *I./ Hi Adrian:*
>>>
>>> *There are two questions that arise…*
>>>
>>>    1. *Why isn’t R2 able to notice? Presumably the link failure
>>>    detection is relying on a lower layer (L2 or L1) failure indication, and
>>>    that is not happening. The answer to this is to run some other link failure
>>>    detection mechanism such as BFD.*
>>>
>>> *Such a mechanism would allow R2 to declare the link down and possibly
>>> re-route/repair the LSP via R5, or notify the head end (R1) to let it
>>> re-route.*
>>>
>>> *AnhVT:* We use both LACP (1s x 3) and micro-BFD (300ms x 3) on this link,
>>> but somehow LACP timed out on R3 first (the link runs over a transmission
>>> system, so the physical link did not go down), and that triggered micro-BFD
>>> to send an Admin Down notification to the remote end (R2). This notification
>>> brought down micro-BFD on R2 but not the LACP client, which is the expected
>>> behavior described in the BFD RFC. After 2s, the link was stable again.
>>> Because of that, R2 does not know that LACP flapped on the R3 side.
>>>
>>>    2. *How could R3 let R2 know that the LSP has been torn down? The
>>>    answer is “by sending a PathErr or ResvTear or Notification”. In general,
>>>    those messages are sent hop by hop, and so they would fail to be routed on
>>>    the failed link R3-R2, however, it is possible to IP-tunnel to
>>>    direct-address RSVP packets so that they would be IP-routed to R2 (or
>>>    direct to R1) via R5.*
>>>
>>>           *AnhVT:* Could you point me to the document that describes this
>>> behavior? I have read several RSVP RFCs but have not seen it described.
>>> *II./ Hi Pavan:*
>>>
>>>    - *Stale state cleanup based on soft state time out (RFC2205): Since
>>>    the link is down, the reservation state isn’t getting refreshed. So, when
>>>    the reservation state times out (in about 157.5 secs for a refresh-interval
>>>    of 30 secs), R2 is expected to clean-up the reservation state and signal a
>>>    ResvTear to the ingress*
>>>
>>>           *AnhVT:* It takes a long time to clean up the reservation
>>> state based on the timeout: more than 2 minutes of blackholed traffic.
>>> This is unacceptable for ISP services.
>>>
>>>
>>>    - Use of RSVP Hello Session based on the Node-ID (RFC4558) for
>>>    detection of RSVP-TE signaling adjacency failure: If there was an RSVP
>>>    Hello session maintained between R2 and R3, R2 would be able to couple the
>>>    state of the LSP with the state of signaling adjacency. And when the
>>>    signaling adjacency failure is detected (Hello State timed out -- for a 9
>>>    sec hello interval, the time out takes 31.5 secs), R2 would clean up the
>>>    reservation state and signal a ResvTear to the ingress. This option can be
>>>    used to clean up stale state when long refresh intervals are used.
>>>    - *AnhVT:* The LACP interface on R3 flaps for only 2s, so the
>>>    Node-ID Hello cannot help in this case. And 31.5s is a long time too. I
>>>    think we need something faster. R3 should send a ResvTear to R2 over the
>>>    IGP path (R3->R5->R2).
>>>
>>>
>>> *III./ Hi Tarek*
>>> *RSVP PathErr can be used to propagate errors upstream – there’s
>>> Path_State_Removed that RFC3473 introduced to also notify that state has
>>> been removed. Does that address your need?*
>>> *AnhVT:* Let me check whether R3 sends a PathErr to R2. As far as I know,
>>> R2 will not bring down the LSP even if it receives a PathErr from R3,
>>> based on https://datatracker.ietf.org/doc/html/rfc2209
>>> [image: image.png]
>>>
>>> Regards,
>>> AnhVT
>>>
>>>
>>>
>>> On Wed, Oct 23, 2024 at 01:23, Vishnu Pavan Beeram <
>>> vishnupavan@gmail.com> wrote:
>>>
>>>> AnhVT, Hi!
>>>>
>>>>
>>>>
>>>> Since you are referring to an LSP in an IP/MPLS network, I’m assuming
>>>> that you are using in-band RSVP signaling. I’m also assuming that this is
>>>> an LSP that does not have any form of local-protection enabled.
>>>>
>>>>
>>>>
>>>> When R3 detects an upstream link-down event, it cleans up the local
>>>> path state and sends a PathTear downstream -- in this scenario, the onus is
>>>> not on R3 to notify the ingress of this outage. The typical expected
>>>> behavior on R2 is to detect the downstream link-down event and send a
>>>> PathErr to the ingress (signaled hop-by-hop) of the LSP. R2 would also
>>>> clean-up the reservation state and send a ResvTear to the ingress (again,
>>>> signaled hop-by-hop). If R2 is not able to detect the link-down event for
>>>> some reason (and no other link state detection mechanism like BFD is
>>>> available), there are a couple of control-plane options that RSVP already
>>>> provides to clean up state (in due course of time) and bring down the LSP:
>>>>
>>>>    - Stale state cleanup based on soft state time out (RFC2205): Since
>>>>    the link is down, the reservation state isn’t getting refreshed. So, when
>>>>    the reservation state times out (in about 157.5 secs for a refresh-interval
>>>>    of 30 secs), R2 is expected to clean-up the reservation state and signal a
>>>>    ResvTear to the ingress
>>>>    - Use of RSVP Hello Session based on the Node-ID (RFC4558) for
>>>>    detection of RSVP-TE signaling adjacency failure: If there was an RSVP
>>>>    Hello session maintained between R2 and R3, R2 would be able to couple the
>>>>    state of the LSP with the state of signaling adjacency. And when the
>>>>    signaling adjacency failure is detected (Hello State timed out -- for a 9
>>>>    sec hello interval, the time out takes 31.5 secs), R2 would clean up the
>>>>    reservation state and signal a ResvTear to the ingress. This option can be
>>>>    used to clean up stale state when long refresh intervals are used.
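
The two timeout figures above follow from the default multipliers. RFC 2205
(section 3.7) gives the state lifetime as (K + 0.5) * 1.5 * R with K = 3,
and RFC 3209 uses a default Hello timeout of 3.5 times the hello interval.
A small Python check of the arithmetic follows; the function names are
hypothetical and the defaults are the ones assumed in this thread.

```python
def state_lifetime_s(refresh_interval_s: float, k: int = 3) -> float:
    """Soft-state lifetime L = (K + 0.5) * 1.5 * R from RFC 2205, section 3.7."""
    return (k + 0.5) * 1.5 * refresh_interval_s


def hello_timeout_s(hello_interval_s: float, multiplier: float = 3.5) -> float:
    """Hello timeout, assuming the default multiplier of 3.5 from RFC 3209."""
    return multiplier * hello_interval_s


if __name__ == "__main__":
    print(state_lifetime_s(30.0))  # 157.5 seconds for a 30-second refresh interval
    print(hello_timeout_s(9.0))    # 31.5 seconds for a 9-second hello interval
```

Both values scale linearly with the configured interval, so shortening the
interval is the only knob these two mechanisms offer for faster cleanup.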
>>>>
>>>>
>>>> Hope this helps.
>>>>
>>>>
>>>>
>>>> Regards,
>>>>
>>>> -Pavan
>>>>
>>>> On Tue, Oct 22, 2024 at 4:34 PM Adrian Farrel <adrian@olddog.co.uk>
>>>> wrote:
>>>>
>>>>> Hi Anh,
>>>>>
>>>>>
>>>>>
>>>>> [Redirecting from MPLS to TEAS as suggested by Tony Li]
>>>>>
>>>>>
>>>>>
>>>>> I think that (given you mention LSPs) you are talking about RSVP-TE
>>>>> (RFC 3209), not plain old RFC 2205 RSVP.
>>>>>
>>>>>
>>>>>
>>>>> In your example, the link R2-R3 has failed in a way that R3 is aware
>>>>> of the failure, but R2 is not aware.
>>>>>
>>>>>
>>>>>
>>>>> There are two questions that arise…
>>>>>
>>>>>    1. Why isn’t R2 able to notice? Presumably the link failure
>>>>>    detection is relying on a lower layer (L2 or L1) failure indication, and
>>>>>    that is not happening. The answer to this is to run some other link failure
>>>>>    detection mechanism such as BFD.
>>>>>
>>>>> Such a mechanism would allow R2 to declare the link down and possibly
>>>>> re-route/repair the LSP via R5, or notify the head end (R1) to let it
>>>>> re-route.
>>>>>
>>>>>    2. How could R3 let R2 know that the LSP has been torn down? The
>>>>>    answer is “by sending a PathErr or ResvTear or Notification”. In general,
>>>>>    those messages are sent hop by hop, and so they would fail to be routed on
>>>>>    the failed link R3-R2, however, it is possible to IP-tunnel to
>>>>>    direct-address RSVP packets so that they would be IP-routed to R2 (or
>>>>>    direct to R1) via R5.
>>>>>
>>>>>
>>>>>
>>>>> Cheers,
>>>>>
>>>>> Adrian
>>>>>
>>>>>
>>>>>
>>>>> *From:* Tuấn Anh Vũ <anhvt.hdg@gmail.com>
>>>>> *Sent:* 22 October 2024 04:45
>>>>> *To:* mpls@ietf.org
>>>>> *Subject:* [mpls] Comments abount rfc2205 Resource ReSerVation
>>>>> Protocol (RSVP)
>>>>>
>>>>>
>>>>>
>>>>> Hi IETF team,
>>>>>
>>>>> I'm AnhVT from the SVTech company in Vietnam. I have experienced some
>>>>> RSVP issues in an IPv4 MPLS network.
>>>>>
>>>>> I suspect that RSVP has a point that needs to be enhanced. I
>>>>> describe this point below:
>>>>>
>>>>>
>>>>>
>>>>> I./ Topology:
>>>>>
>>>>>     ---------LSP-------->
>>>>>     R1----R2----R3-----R4
>>>>>           |    /
>>>>>           |  /
>>>>>            R5
>>>>>
>>>>> II./ Issue
>>>>>
>>>>> 1./ Because of some bug (e.g., R3 experiences a link flap between
>>>>> R3 and R2, but R2 does not detect the interface flap), R3 considers the
>>>>> LSP down, deletes the LSP state, and sends a PathTear downstream to
>>>>> R4.
>>>>>
>>>>> 2./ Because R2 does not detect the interface flap, R2 still keeps the
>>>>> LSP state. It does not know that the LSP should be deleted.
>>>>>
>>>>> 3./ Due to 1./ and 2./, R1 does not know that the LSP is stuck (R3 and
>>>>> R4 have deleted the LSP state), so R1 continues forwarding traffic into
>>>>> the LSP. This brings the service down.
>>>>>
>>>>>
>>>>>
>>>>> III./ My comment
>>>>>
>>>>> I think that RSVP needs a mechanism for R3 to signal to R2, so that R2
>>>>> knows that R3 has deleted the LSP. Based on that signal, R2 will bring
>>>>> down the LSP and send a ResvTear on to R1.
>>>>>
>>>>>
>>>>>
>>>>> I hope that you take a look at my comment.
>>>>>
>>>>>
>>>>>
>>>>> Regards,
>>>>>
>>>>> AnhVT