Re: [Teas] Alia Atlas' Discuss on draft-ietf-teas-gmpls-lsp-fastreroute-10: (with DISCUSS and COMMENT)
"Rakesh Gandhi (rgandhi)" <rgandhi@cisco.com> Wed, 02 August 2017 22:07 UTC
From: "Rakesh Gandhi (rgandhi)" <rgandhi@cisco.com>
To: Alia Atlas <akatlas@gmail.com>, The IESG <iesg@ietf.org>
CC: "draft-ietf-teas-gmpls-lsp-fastreroute@ietf.org" <draft-ietf-teas-gmpls-lsp-fastreroute@ietf.org>, "teas-chairs@ietf.org" <teas-chairs@ietf.org>, "teas@ietf.org" <teas@ietf.org>, "vbeeram@juniper.net" <vbeeram@juniper.net>, DEBORAH BRUNGARD <db3546@att.com>
Date: Wed, 02 Aug 2017 22:07:42 +0000
Message-ID: <0666F681-D84D-4AA3-8538-9A5BDDD4FAED@cisco.com>
References: <150170045164.5759.13003373677933598821.idtracker@ietfa.amsl.com>
In-Reply-To: <150170045164.5759.13003373677933598821.idtracker@ietfa.amsl.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/teas/5gzX1hC1l0I-UQti4Yh80iZrIMA>

Hi Alia,

Thank you for the detailed review of the document and your comments. Please see inline with <RG>.

On 2017-08-02, 3:00 PM, "Teas on behalf of Alia Atlas" <teas-bounces@ietf.org on behalf of akatlas@gmail.com> wrote:

Alia Atlas has entered the following ballot position for draft-ietf-teas-gmpls-lsp-fastreroute-10: Discuss

When responding, please keep the subject line intact and reply to all email addresses included in the To and CC lines. (Feel free to cut this introductory paragraph, however.)

Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html for more information about IESG DISCUSS and COMMENT positions.

The document, along with other ballot positions, can be found here: https://datatracker.ietf.org/doc/draft-ietf-teas-gmpls-lsp-fastreroute/

----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

1) In Sec 4.5.1: "The downstream PLR can assign a bypass tunnel when processing the first Path message of the protected LSP, however, it can not update the forwarding plane until it receives the Resv message containing the downstream MP label."

Please explain how the downstream PLR can assign a bypass tunnel if the LSP has a loose ERO - so the downstream PLR does not know the next-next-hop that would be the MP for a node-protecting LSP.

<RG> This sentence should be updated as “With exception of the ABR node protection case, where the bypass tunnel starts and ends in different domains,”

2) Sec 4.5.1: "An upstream PLR (downstream MP) SHOULD check all BYPASS_ASSIGNMENT subobjects in the Path RRO in order to assign a reverse bypass tunnel. The upstream PLR that detects a BYPASS_ASSIGNMENT subobject, selects a reverse bypass tunnel that terminates locally with the destination address and tunnel-ID from the subobject, and has a source address matching the Node-ID address."

This isn't very clear - particularly given that there will be many BYPASS_ASSIGNMENT subobjects in the path RRO. The case of BYPASS_ASSIGNMENT sub-objects being removed or changed is not addressed at all. In addition, I *assume* that the failure to treat the destination IP address in the BYPASS_ASSIGNMENT as the source IP address for the upstream Bypass tunnel is an oversight?

I believe that what is meant is: "An upstream PLR (downstream MP) SHOULD check all BYPASS_ASSIGNMENT sub-objects in the Path RRO to see if the destination IP address in the BYPASS_ASSIGNMENT matches an address of the upstream PLR. For each BYPASS_ASSIGNMENT sub-object that matches, the upstream PLR looks for a local bypass tunnel that has a destination matching the downstream PLR that inserted the BYPASS_ASSIGNMENT, as indicated by the Node-ID address, and the same tunnel-ID as indicated in the BYPASS_ASSIGNMENT."

<RG> Your suggested text looks good.

I recall that tunnel-ID is usually scoped by the address of the ingress LSR; this seems to assume that the same tunnel-ID is provided to both the downstream PLR and upstream PLR???

Alternately, I am misunderstanding - and the information in the BYPASS_ASSIGNMENT is really intended to be bypass tunnel to be used by the upstream PLR, which the downstream PLR somehow(??details, hints in the document please) knows.

<RG> The PLR adds the FEC of the bypass tunnel (Source/Destination/tunnel-ID). The MP uses the FEC for lookup.

Then there needs to be text to handle the case where the previous PATH message contained a particular BYPASS_ASSIGNMENT sub-object and that sub-object has been removed or changed.

<RG> Yes, we can add a sentence to clarify.

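For illustration only, the lookup described in the suggested text for item 2 can be sketched in Python. The names (BypassAssignment, BypassTunnel, select_reverse_bypass, removed_assignments) and the example addresses are hypothetical and do not come from the draft. The matching rule follows the suggested wording above, and the sketch only reports which sub-objects disappeared between two Path messages; what the upstream PLR should then do is not settled in this thread.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class BypassAssignment:
    """Hypothetical view of one BYPASS_ASSIGNMENT sub-object in the Path RRO."""
    node_id: str       # Node-ID address of the downstream PLR that inserted it
    destination: str   # bypass destination address carried in the sub-object
    tunnel_id: int     # bypass tunnel-ID carried in the sub-object


@dataclass(frozen=True)
class BypassTunnel:
    """Hypothetical view of a bypass tunnel known locally at the upstream PLR."""
    source: str
    destination: str
    tunnel_id: int


def select_reverse_bypass(
    local_addresses: Set[str],
    local_tunnels: Iterable[BypassTunnel],
    rro_assignments: Iterable[BypassAssignment],
) -> List[BypassTunnel]:
    """Return the local tunnels matched by the sub-objects addressed to this node."""
    tunnels = list(local_tunnels)
    selected = []
    for assignment in rro_assignments:
        # Skip sub-objects whose destination is not an address of this node.
        if assignment.destination not in local_addresses:
            continue
        for tunnel in tunnels:
            # Match on the inserting PLR (Node-ID address) and the tunnel-ID.
            if (tunnel.destination == assignment.node_id
                    and tunnel.tunnel_id == assignment.tunnel_id):
                selected.append(tunnel)
                break
    return selected


def removed_assignments(
    previous: Iterable[BypassAssignment],
    current: Iterable[BypassAssignment],
) -> Set[BypassAssignment]:
    """Sub-objects present in the previous Path RRO but absent (or changed) now."""
    return set(previous) - set(current)
```

With made-up addresses, an upstream PLR that owns 10.0.0.4 and sees a sub-object inserted by 10.0.0.3 with destination 10.0.0.4 and tunnel-ID 100 would select its local tunnel towards 10.0.0.3 with tunnel-ID 100, and would ignore sub-objects addressed to other nodes.
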
3) Sec 4.5.3: "In both examples above, the upstream PLR SHOULD send a Notify message [RFC3473] with Error-code - FRR Bypass Assignment Error (value: TBA1) and Sub-code - Bypass Assignment Cannot Be Used (value: TBA2) to the downstream PLR to indicate that it cannot use the bypass tunnel assignment in the reverse direction. Upon receiving this error, the downstream PLR MAY remove the bypass tunnel assignment and select an alternate bypass tunnel if one available."

This section is problematic because it creates the use of local policy when the ingress has a clear way to signal what type of protection is desired and because it provides an error message to where it will only cause pointless churn (the MP is the MP based on the type of protection desired - certainly for bypass) rather than to the ingress where it could at least be acted upon.

The dynamics at time of failure also do not seem to be well considered; asymmetry is unfortunate, but worse is lack of protection.

Consider the case in Example 1. If R5 suffers a node failure, then there is no protection for the upstream LSP from R6 if it prefers the link protection. It simply doesn't matter what bypass tunnel R4 picks! Sending a Notify message to R4 asking for a different tunnel is not productive. If the ingress has requested node-protection, then there is simply nothing that can be done for this topology by R5. It could be helpful to send a Notify to the ingress or have a flag set in the RESV RRO to indicate the issue, but that's about it.

For the question about creating local policy, how are the SESSION_ATTRIBUTES used? Obviously, they are available in the PATH message that has the BYPASS_ASSIGNMENTs. Why would the "Node Protection Desired" flag not be relevant here?

<RG> The document highlights the issue that can occur due to different local policies on PLR and MP nodes and hence they should not be provisioned that way to avoid it ☺

<RG> We can add a sentence to indicate that Session attributes flags are carried in the forward and reverse directions and can be used by the PLR and MP nodes in case there are different local policies.

4) Sec 5:

"  o  Upstream PLR reroutes traffic upon detecting the link failure or upon receiving RSVP Path message over the bidirectional bypass tunnel.

   o  Upstream PLR also reroutes RSVP Resv signaling after receiving RSVP Path message over the bidirectional bypass tunnel. "

How does the upstream PLR detect that the message was received over the bypass tunnel? Is the assumption that the bypass LSP doesn't do penultimate hop popping? Is the assumption that the PLR can tell because RSVP indicates the downstream PLR as the previous hop in its signaling? Please clarify and describe how this detection is done - to ease interoperability.

<RG> RFC 4090 has details on what is changed when primary LSP messages are sent over the bypass. No change in the processing required in this document for this case for MP to detect FRR.

<RG> We could indicate “using the procedure defined in [RFC4090]”.

5) In Sec 5.1.2: "When upstream PLR R4 receives the protected LSP Path messages over the restored link, if not already done, it starts sending Resv messages and traffic flow of the protected LSP over the restored link and stops sending them over the bypass tunnel."

Is there a reason that "when the downstream PLR receives the protected LSP RESV messages over the restored link, if not already done, it starts sending Path messages and traffic flow of the protected LSP over the restored link and stops sending them over the bypass tunnel." doesn't also make sense to put in this section? If this is not a good idea, please explain clearly the issues that it causes.

<RG> This was updated in the last revision to keep the processing symmetric – before FRR and for restoration after FRR.

I am assuming that "after the link is restored" implies that bidirectional communication has been successfully tested - not merely that the physical layer is up but also that an IGP or BFD is successful across it. (But this is standard for RSVP-TE FRR).

6) Sec 5.2.2: The behavior of R4 is not described. When the link from R3-R4 fails, R4 will redirect traffic to R2. As written at the start of Sec 5, R4 does not start sending its Resv across the bypass tunnel and R2 is thus not triggered to use its bypass tunnel. Please clearly describe this and why. It is this asymmetry in behavior for the downstream PLR and upstream PLR that causes the downstream PLR's bypass tunnel to be prioritized.

<RG> R4 is not involved in re-corouting phase. It does normal FRR processing (e.g. Section 5.1.1).

7) Sec 5.2.2: The need for the PRR to look up the bypass tunnel and then reprogram the forwarding plane is quite concerning for having this operate at significant scale. What could be done if one assumes that the selected bypass tunnel - from the BYPASS_PROTECTION handling - is used? Is there a reason that decision has to be redone here? What is the issue that the solution is trying to work around? I can certainly imagine scenarios with BFD sessions so that the PRR can be rapidly failed over as the result of the BFD session going down. What scale of LSPs are you expecting this scenario to handle?

<RG> This is not a “normal” case.

8) Sec 5.2.2: Given that the PRR will TEAR DOWN the LSP if it can't find a matching bypass tunnel, it would be quite useful for the ingress to have visibility as to the protection available. In RFC 4090, Sec 4.4 defines both "local protection available" and "local protection in use" flags in the IPv4/IPv6 sub-objects.
Clearly, that isn't sufficient for the co-routed case because the ingress needs to know also that "local upstream protection available" and perhaps "local upstream protection in use".

<RG> Yes, these flags are definitely used, see Section 4.4.

9) Sec 5.2.3:

"  o  The upstream PLR R4 starts sending the traffic flow of the protected LSP over the restored link towards downstream PLR R3 and forwarding the Path messages towards PRR R5 and stops sending the traffic over the bypass tunnel.

   o  When upstream PLR R4 receives the protected LSP Path messages over the restored link, if not already done, it starts sending Resv messages and traffic flow over the restored link towards downstream PLR R3 and forwarding the Path messages towards PRR R5 and stops sending them over the bypass tunnel."

In the referenced figures, R4 is NOT an upstream PLR; that is R5. R4 could have forgotten all state associated with the bi-directional LSP. Please fix the text to actually describe the behavior.

<RG> R4 is the node where restored link is detected in Figure 3. So it is doing the upstream PLR processing for link restoration case.

10) Sec 5.3: "Unidirectional link failures can result in the traffic flowing on asymmetric paths in the forward and reverse directions. In addition, unidirectional link failures can cause RSVP soft-state timeout in the control-plane in some cases. As an example, if the unidirectional link failure is in the upstream direction (from R4 to R3 in Figures 1 and 2), the downstream PLR (node R3) can stop receiving the Resv messages of the protected LSP from the upstream PLR (node R4 in Figures 1 and 2) and this can cause RSVP soft-state timeout to occur on the downstream PLR (node R3)."

Is the assumption that there is no IGP or BFD running on the link? If not, then the IGP or BFD session will go down on the link first, making it unavailable to RSVP-TE and should trigger the fast-reroute.

Also - given this issue, why does the upstream MP not start using the bypass tunnel when receiving Resv through a bypass tunnel? There is no explanation in the draft and there should be - to prevent incorrect "optimizations". Ideally, the draft would specify something like MUST NOT or SHOULD NOT with explanation - if that is the case.

<RG> GMPLS signaling has master/slave model. So Forward direction is always a master and reverse direction is slave, this is to avoid oscillations where two sides start making independent decisions.

11) Sec 7.1: The description for the BYPASS_ASSIGNMENT completely fails to be clear as to whether the contents are for the bypass tunnel used by the node inserting it into the RRO or whether the contents are a direction for the node that receives it - based on the Node ID that is included.

<RG> Node inserts the FEC of the bypass tunnel it assigns locally which is then used by the MP for lookup.

----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

a) Sec 5.2.2.1: The approach suggested here seems fairly intensive from a forwarding plane perspective. It would be very helpful to indicate the range of expected/desired time for the fail-over.

<RG> This is same as control-plane except the FRR on MP side is detected by the data-plane.

b) Sec 5.2: This section is about node failures - but while the bypass tunnels are node-protecting, the failures discussed are only link. A brief example that describes the expected signaling for an actual node failure would be helpful.

<RG> There should be no difference in processing if link or node fails, as long as bypass tunnel is next-next-hop.

Thanks,
Rakesh

_______________________________________________
Teas mailing list
Teas@ietf.org
https://www.ietf.org/mailman/listinfo/teas