Re: [spring] WG Adoption Call - draft-cheng-rtgwg-srv6-multihome-egress-protection (02/09/24 - 02/24/24)

Weiqiang Cheng <chengweiqiang@chinamobile.com> Wed, 28 February 2024 06:00 UTC

Date: Wed, 28 Feb 2024 13:59:22 +0800
From: Weiqiang Cheng <chengweiqiang@chinamobile.com>
To: "bruno.decraene" <bruno.decraene@orange.com>, "yingzhen.ietf" <yingzhen.ietf@gmail.com>
Cc: "ketant.ietf" <ketant.ietf@gmail.com>, rtgwg <rtgwg@ietf.org>, draft-cheng-rtgwg-srv6-multihome-egress-protection <draft-cheng-rtgwg-srv6-multihome-egress-protection@ietf.org>, rtgwg-chairs <rtgwg-chairs@ietf.org>, spring-chairs <spring-chairs@ietf.org>, spring <spring@ietf.org>
References: <CABY-gOMQ=LaECWJsJHsdKX7i+BUsiX=LF5b5ZPMVp=3qQjZ8Mg@mail.gmail.com>, <CAH6gdPyuWV=xvDerDCtXnD1T5CGymsm+b1i-idRGEs1w9aui=A@mail.gmail.com>, <CABY-gOPDLs6j+YPSYhbwnvvkfTi1VyPN8Vr6XWs9oy28cxr6Mw@mail.gmail.com>, <AS2PR02MB88398552B14D8CBE45E57BB8F05A2@AS2PR02MB8839.eurprd02.prod.outlook.com>, <CABY-gOO1F6CUDC8kmHV1EK894gv_YvMVWn0swo29K1GORhxETQ@mail.gmail.com>, <AS2PR02MB8839D4825A93A6991ED37BC0F0592@AS2PR02MB8839.eurprd02.prod.outlook.com>
Message-ID: <20240228135922442303111@chinamobile.com>
Content-Type: multipart/alternative; boundary="----=_001_NextPart717142557080_=----"
Archived-At: <https://mailarchive.ietf.org/arch/msg/spring/83D3QJy_E9JjlBet4-GoRlrQO9A>
Subject: Re: [spring] WG Adoption Call - draft-cheng-rtgwg-srv6-multihome-egress-protection (02/09/24 - 02/24/24)

Hi Bruno and Yingzhen,
Thank you very much for your insightful comments and guidance.
Please see co-authors' feedback inline [co-author]. 

Thanks,
Weiqiang Cheng


 
From: bruno.decraene
Date: 2024-02-27 16:12
To: Yingzhen Qu
CC: Ketan Talaulikar; RTGWG; draft-cheng-rtgwg-srv6-multihome-egress-protection; rtgwg-chairs; spring-chairs@ietf.org; spring@ietf.org
Subject: Re: [spring] WG Adoption Call - draft-cheng-rtgwg-srv6-multihome-egress-protection (02/09/24 - 02/24/24)
Hi Yingzhen,
 
Thank you for your answers and clarification. That really helps.
Please see inline [Bruno]
 
From: Yingzhen Qu <yingzhen.ietf@gmail.com> 
Sent: Tuesday, February 27, 2024 3:42 AM
To: DECRAENE Bruno INNOV/NET <bruno.decraene@orange.com>
Cc: Ketan Talaulikar <ketant.ietf@gmail.com>; RTGWG <rtgwg@ietf.org>; draft-cheng-rtgwg-srv6-multihome-egress-protection <draft-cheng-rtgwg-srv6-multihome-egress-protection@ietf.org>; rtgwg-chairs <rtgwg-chairs@ietf.org>; spring-chairs@ietf.org; spring@ietf.org
Subject: Re: WG Adoption Call - draft-cheng-rtgwg-srv6-multihome-egress-protection (02/09/24 - 02/24/24)
 
Hi Bruno,
 
Thank you for the feedback, really appreciate it.
 
Let me try to answer your questions with my understanding, and the authors please chime in.
 
Thanks,
Yingzhen
 
On Mon, Feb 26, 2024 at 5:07 AM <bruno.decraene@orange.com> wrote:
Dear Yingzhen,

At your request, I have quickly parsed the draft.

It's not completely clear to me how the solution works given that the terminology used is a bit loose.

2 questions on the terminology:

1) "protection" vs "restoration". The document largely uses the term "protection", in particular in its title. This usually assumes that protection is precomputed, local to the penultimate node (before the failure) and hence can be fast.
I'm assuming that "protection" is indeed meant. Please correct me if this is wrong. In which case, the node doing the protection is usually called PLR and is reacting before the IGP convergence. If so, it would be good for the document to reflect this (use the term PLR, remove the reference to IGP fast convergence)
(On the other hand, if restoration following IGP convergence were meant, the gain seems limited to me, as the ingress would equally be able to react to IGP convergence, and with PIC edge it would do so fast)
[Yingzhen]:  "Protection" is the right term. 
 
[Bruno] OK. Therefore that’s before the IGP convergence. Could you please remove the references to IGP convergence?

[co-authors]  OK. We will update this in the new version.

2) "Penultimate node" vs "penultimate Endpoint."
RFC 8754 defines different types of nodes. https://datatracker.ietf.org/doc/html/rfc8754#section-3
In particular, a transit node is a regular IPv6 router which does not process the SRH, while an SR Segment Endpoint Node is a node receiving an IPv6 packet whose destination address is locally configured as a segment or local interface.

Can you please clarify whether your Penultimate Endpoint (§3.3) is a Transit Node or an SR Segment Endpoint Node?

- If the Penultimate Endpoint (§3.3) is a Transit Node, then as per RFC8200 it's not allowed to process the SRH. https://datatracker.ietf.org/doc/html/rfc8200#section-4 Hence your proposal would not be compliant with RFC 8200
- If the Penultimate Endpoint (§3.3) is an SR Segment Endpoint Node, the Ingress needs to specifically add a Segment of this node (typically End or End.X). If so, please clarify this in the document (in particular in §3.1). Note that by doing so, you are adding a new point of failure (the failure of this Penultimate Endpoint). How do you protect from this added case of failure? If you don't, I would argue that the gain is debatable, as you replace one type of failure (PE failure) with another type of failure (P failure).
[Yingzhen]: This should be "penultimate SR segment endpoint". A transit node may not have the capability to process an SRH. With that being said, the penultimate SR endpoint may be several hops away from the PE, and this requires some failure detection mechanism, such as multi-hop BFD.
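
To make the RFC 8754 distinction in this exchange concrete, here is a minimal Python sketch; it is not taken from the draft. It assumes a node simply holds a set of locally instantiated SIDs and a set of local interface addresses, and it classifies the node's role for one received packet. The function name, the role labels and the 2001:db8:: documentation addresses are illustrative assumptions.

```python
from ipaddress import IPv6Address


def classify_node_role(destination, local_sids, local_interface_addrs):
    """Classify this node's role for one packet, per RFC 8754 section 3.

    Assumption: SIDs and interface addresses are modelled as plain IPv6
    addresses; real implementations match SIDs against locator prefixes.
    """
    dest = IPv6Address(destination)
    local = {IPv6Address(a) for a in local_sids}
    local |= {IPv6Address(a) for a in local_interface_addrs}
    # Destination is a local segment or interface: SR segment endpoint node,
    # which may process the SRH. Otherwise the node is a transit node and
    # only forwards on the IPv6 destination address.
    if dest in local:
        return "sr-segment-endpoint"
    return "transit"
```

Under this model the role is a per-packet property: the same P router is a transit node for packets whose destination address is not one of its own SIDs, which is why the ingress has to insert a SID of that router for it to act as the penultimate SR segment endpoint.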
 
[Bruno] OK. Could you please clarify this in the draft? (the name and the need for multi-hop BFD hence a discussion about scaling and probably configuration.)
Do we agree that in the absence of an SR-Policy, that penultimate SR segment endpoint is in fact the ingress PE hence nothing changed? If so, could you please clarify in the draft that this only applies to traffic using SR-policies?

[co-authors]  This draft is not limited to the SR-Policy scenario. If no SR-Policy is configured and only two SIDs exist, namely primary and backup, any intermediate node can bypass the failed tail node.
This can be enabled by configuration (a command that activates the feature), or by extending the SRH with a flag indicating that skipping may be performed.
We will update the description in the next version.
 

I have another question on §3.3
How does the penultimate Endpoint know that it can/needs to perform the new behavior? My guess would be by looking at the next SID (the one from the egress) and discovering that the behavior of this SID is End.D* with PSD. That would seem to require this P node to be aware of all VPN routes, which is typically not the case, is frowned upon, and does not scale well, as the P nodes would have tens of PEs (if not hundreds).
[Yingzhen]: my understanding is that this protection mechanism is not to be deployed for all PEs, but only a subset of them. Otherwise I agree with you that it doesn't scale.
 
[Bruno] OK. Could you please clarify in the draft that this solution requires, on the penultimate SR segment endpoint, i.e., a P node, the knowledge of all VPN routes of the nominal PE which need to be protected by backup PE (and in the absence of configuration, this likely requires this P to have the knowledge, in the control plane, of all VPN routes).
Plus, given that the dataplane needs to check the next SID in the SRH, it also requires the knowledge of the protected routes in the dataplane/FIB. If so, it’s not clear to me what the benefit of adding the backup SID in the SRH is, since the P node needs to have these states in its FIB anyway and so could “replace” the ultimate SID with this knowledge. Given that this is the core of this draft, I’m not sure I understand the key benefit of this solution.

[co-authors] There is no need to store VPN routes, and the control plane does not need to add entries. In the P node, using a mechanism similar to [I-D.ietf-spring-segment-protection-sr-te-paths], when the next SID is unreachable, the node can skip the unreachable SID and forward to the backup SID. This behavior can be controlled through configuration or by adding a flag in the SRH to indicate whether to skip unreachable nodes.
We can clarify the point in the new version. 
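
The skip behavior described in the co-authors' reply can be sketched roughly as follows. This is an illustration under stated assumptions, not the draft's data-plane procedure: the SID list is modelled in travel order (primary egress SID followed by backup egress SID) rather than in the SRH's on-wire encoding, reachability is a plain set lookup standing in for the node's FIB, and `skip_enabled` stands in for either the local configuration knob or the proposed SRH flag. All names are hypothetical.

```python
from typing import Optional, Sequence, Set


def select_next_sid(
    sid_path: Sequence[str],
    next_index: int,
    reachable: Set[str],
    skip_enabled: bool,
) -> Optional[str]:
    """Pick the SID to forward to, or None if the packet cannot be forwarded.

    sid_path     -- SIDs in travel order (hypothetical model of the SRH)
    next_index   -- position of the SID that would normally be next
    reachable    -- SIDs this node currently has a route to
    skip_enabled -- whether skipping an unreachable SID is permitted
    """
    if not 0 <= next_index < len(sid_path):
        return None
    primary = sid_path[next_index]
    if primary in reachable:
        return primary  # normal case: forward toward the primary egress
    # Primary is unreachable: skip it only if permitted and a following
    # (backup) SID exists and is itself reachable.
    if skip_enabled and next_index + 1 < len(sid_path):
        backup = sid_path[next_index + 1]
        if backup in reachable:
            return backup
    return None
```

In this model the node consults only its own reachability to the two SIDs carried in the packet, which is the co-authors' point that no per-VPN-route state is needed; how the node learns that the primary SID is unreachable (for example via multi-hop BFD, as raised earlier in the thread) is outside the sketch.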

 
Thanks,
--Bruno
 
 
On a side note, the abstract seems a bit short to me.

So thanks for clarifying the document,
Regards,
--Bruno

> 
> -----Original Message-----
> From: Yingzhen Qu <yingzhen.ietf@gmail.com>
> Sent: Sunday, February 25, 2024 6:44 AM
> To: Ketan Talaulikar <ketant.ietf@gmail.com>; spring-chairs@ietf.org
> Cc: RTGWG <rtgwg@ietf.org>; spring@ietf.org; rtgwg-chairs <rtgwg-chairs@ietf.org>; draft-cheng-rtgwg-srv6-multihome-egress-protection <draft-cheng-rtgwg-srv6-multihome-egress-protection@ietf.org>
> Subject: Re: WG Adoption Call - draft-cheng-rtgwg-srv6-multihome-egress-protection (02/09/24 - 02/24/24)
> 
> Dear SPRING WG and chairs,
> 
> I'd like to bring your attention to this adoption call happening in the RTGWG WG.
> 
> The draft describes an SRv6 egress node protection mechanism in multi-home scenarios. As Ketan has commented in his email below, the proposal requires a P router to process the SRH with a new endpoint behavior.
> 
> We'd like to get your comments about the proposed extensions. Please send your reply to both the SPRING and RTGWG mailing lists.
> 
> Thanks,
> Yingzhen
> 
> On Wed, Feb 21, 2024 at 8:06 AM Ketan Talaulikar <ketant.ietf@gmail.com>
> wrote:
> 
> > Hi Yingzhen/All,
> >
> > I have some concerns regarding the adoption of this document.
> >
> >
> >    - Do we need these different solutions?
> >
> > KT> No. There is one common author for both these drafts who is also from
> > a vendor. I hope that person is also able to evaluate implementation
> > aspects and pick one solution.
> > KT> Does the adoption of this solution make the other draft "dead"?
> >
> >    - Technical merits and drawbacks of each solution
> >
> > KT> The existing WG draft needs IGP protocol extensions and its
> > implementation is very complex (as stated in the document under
> > adoption)
>
____________________________________________________________________________________________________________
Ce message et ses pieces jointes peuvent contenir des informations confidentielles ou privilegiees et ne doivent donc
pas etre diffuses, exploites ou copies sans autorisation. Si vous avez recu ce message par erreur, veuillez le signaler
a l'expediteur et le detruire ainsi que les pieces jointes. Les messages electroniques etant susceptibles d'alteration,
Orange decline toute responsabilite si ce message a ete altere, deforme ou falsifie. Merci.

This message and its attachments may contain confidential or privileged information that may be protected by law;
they should not be distributed, used or copied without authorisation.
If you have received this email in error, please notify the sender and delete this message and its attachments.
As emails may be altered, Orange is not liable for messages that have been modified, changed or falsified.
Thank you.