Re: [bess] WGLC,IPR and implementation poll for draft-ietf-bess-mvpn-fast-failover

<zhang.zheng@zte.com.cn> Tue, 12 February 2019 07:44 UTC

Date: Tue, 12 Feb 2019 15:44:24 +0800
Message-ID: <201902121544240415726@zte.com.cn>
In-Reply-To: <CA+RyBmX6RgJ95ptcexEzH8R0ns8xxUhMgL2aAK8xgewcw0-z3w@mail.gmail.com>
References: CA+RyBmXcu3b9dObX=G9vyHNJtEuJ4wWqMtQXvxCNxgNOSCsmWw@mail.gmail.com, CA+RyBmX6RgJ95ptcexEzH8R0ns8xxUhMgL2aAK8xgewcw0-z3w@mail.gmail.com
Mime-Version: 1.0
From: zhang.zheng@zte.com.cn
To: gregimirsky@gmail.com
Cc: zzhang@juniper.net, bess-chairs@ietf.org, thomas.morin@orange.com, rkebler@juniper.net, bess@ietf.org
Content-Type: multipart/mixed; boundary="=====_001_next====="
X-MAIL: mse01.zte.com.cn x1C7iMjv021820
Archived-At: <https://mailarchive.ietf.org/arch/msg/bess/HHxtuzaMqcTZI_t-Bq-4bBhsJNg>
Subject: Re: [bess] WGLC,IPR and implementation poll for draft-ietf-bess-mvpn-fast-failover
List-Id: BGP-Enabled ServiceS working group discussion list <bess.ietf.org>

Hi Greg,

Thank you for the helpful modifications and clarifications!
I still have some comments on two sections. I have copied the relevant content here because the mail is too long:
1.
3. I am confused by sections 3.1.1/3.1.2/3.1.3. IMO only the X-PMSI tunnel's state influences the BFD session; there is no need for other decisions.
GIM2>> The Upstream PE as MultipointHead of the p2mp BFD session may use a combination of conditions (the combination being determined by policy) to control the state of the BFD session, e.g., set it to AdminDown. I think that the use of policy to control the conditions that affect the P-tunnel reception state is the advantage of the proposed solution. What do you think?
4. For section 3.1.5, IMO the counter method has no relationship with the BFD function defined in this document. Will the counter method be used as a supplement to the BFD session?
GIM2>> As above, this is one of the conditions, controlled by the policy, that may be considered to influence the state of the BFD session.
Sandy2> Since BFD packets are forwarded through the x-PMSI tunnel, the egress PE can learn the tunnel state from the expiration of the BFD detection timer. So the administrator may choose different policies to control the session state, but detection based on BFD packets should be the basis. IMO sections 3.1.1 through 3.1.4 are optional.
For the counter information in section 3.1.5, how does the configurable timer work with the BFD detection timer? What should the egress PE do when the two timers expire if both are used?

2.
For section 3.1.7.1, the last sentence.
GIM2>> I think that Jeffrey asked why the new BFD Discriminator must be sent and the new p2mp BFD session must be initiated. Your question, as I interpret it, is about how, operationally, an implementation can minimize the disruption when the new BFD session is advertised to replace one that already exists. Firstly, would we agree that sending the new BGP-BFD Discriminator and starting the new p2mp BFD session when the RPF interface changes is the right action? If we agree, then I can add a sentence or two to describe an optional procedure for the upstream PE to minimize the disruption when the egress PE switches to the new p2mp BFD session.
Sandy2> Can the "old" BFD discriminator be reused in the new advertisement when the switchover happens on the same upstream PE? If the "old" discriminator can be reused, it does not seem necessary to build a new BFD session. But if a new BFD discriminator MUST be generated, then the newly added sentence looks good to me.

Thanks,
Sandy

------------------Original Message------------------
From: GregMirsky <gregimirsky@gmail.com>
To: 张征00007940;
Cc: zzhang@juniper.net <zzhang@juniper.net>;bess-chairs@ietf.org <bess-chairs@ietf.org>;Thomas Morin <thomas.morin@orange.com>;Robert Kebler <rkebler@juniper.net>;BESS <bess@ietf.org>;
Date: 2019-02-07 08:16
Subject: Re: [bess] WGLC,IPR and implementation poll for draft-ietf-bess-mvpn-fast-failover
Hi Sandy, I much appreciate your comments. Please find my answers below tagged GIM2>>.
Attached, please find the updated working version and the diff to the last published version.

Kind regards,
Greg

On Thu, Jan 31, 2019 at 7:40 PM <zhang.zheng@zte.com.cn> wrote:
Hi Greg, Jeffrey, co-authors,

I have some concerns about the questions raised by Jeffrey; please see below, tagged Sandy>.
And I have some other questions:
1. According to "draft-ietf-bfd-multipoint-19" and the function defined in this draft, IMO the BFD session should be demultiplexed by the combination of the upstream peer address, the discriminator, and the X-PMSI that is used for flow forwarding. IMO this should be stated clearly in the draft.
GIM2>> Agreed and to clarify I propose the following update to the Downstream PE Procedures:
OLD TEXT:
On receiving the BGP-BFD Attribute in the x-PMSI A-D Route, the
Downstream PE:

o  MUST associate the received BFD discriminator value with the
P-tunnel originating from the Root PE;

o  MUST create p2mp BFD session and set bfd.SessionType =
MultipointTail as described in [I-D.ietf-bfd-multipoint];

o  MUST use the source IP address of a BFD control packet, the value
of BFD Discriminator from the BGP-BFD Attribute to properly
demultiplex BFD sessions;

NEW TEXT:
Upon receiving the BGP-BFD Attribute in the x-PMSI A-D Route, the
Downstream PE:

o  MUST associate the received BFD discriminator value with the
P-tunnel originating from the Root PE and the IP address of the
Upstream PE;

o  MUST create p2mp BFD session and set bfd.SessionType =
MultipointTail as described in [I-D.ietf-bfd-multipoint];

o  MUST use the source IP address of the BFD control packet, the
value of the BFD Discriminator field, and the identifier of the
x-PMSI tunnel on which the BFD control packet was received to
properly demultiplex BFD sessions.
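
A minimal sketch of the demultiplexing rule in the NEW TEXT above, assuming a Downstream PE keeps its MultipointTail sessions in a table keyed by the three values named there. The class and field names (SessionKey, TailSessionTable, tunnel_id as an opaque string) are hypothetical and are not taken from the draft or from any implementation.

```python
from dataclasses import dataclass
from typing import Dict, NamedTuple, Optional


class SessionKey(NamedTuple):
    """Lookup key; field names are illustrative only."""
    source_ip: str      # source IP address of the received BFD control packet
    discriminator: int  # BFD Discriminator learned from the BGP-BFD Attribute
    tunnel_id: str      # hypothetical opaque identifier of the x-PMSI tunnel


@dataclass
class TailSession:
    key: SessionKey
    session_type: str = "MultipointTail"
    state: str = "Down"


class TailSessionTable:
    """Holds the p2mp BFD tail sessions of one Downstream PE."""

    def __init__(self) -> None:
        self._sessions: Dict[SessionKey, TailSession] = {}

    def create(self, source_ip: str, discriminator: int, tunnel_id: str) -> TailSession:
        # Called when a BGP-BFD Attribute arrives in an x-PMSI A-D Route.
        key = SessionKey(source_ip, discriminator, tunnel_id)
        session = TailSession(key)
        self._sessions[key] = session
        return session

    def demultiplex(self, source_ip: str, discriminator: int, tunnel_id: str) -> Optional[TailSession]:
        # All three values must match; otherwise the packet belongs to no session.
        return self._sessions.get(SessionKey(source_ip, discriminator, tunnel_id))
```

With this keying, the same discriminator value received over a different x-PMSI tunnel, or from a different Upstream PE, does not match an existing session.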

2. The P2MP BFD packet should be delivered in the X-PMSI tunnel. The BFD multicast packet MUST be encapsulated in the associated tunnel. There seems to be no specification for this.
GIM2>> Agree and to clarify I propose the following text to be added to the Upstream PE Procedures section:
NEW TEXT:
o  MUST periodically transmit BFD control packets over the x-PMSI
tunnel.
3. I am confused by sections 3.1.1/3.1.2/3.1.3. IMO only the X-PMSI tunnel's state influences the BFD session; there is no need for other decisions.
GIM2>> The Upstream PE as MultipointHead of the p2mp BFD session may use a combination of conditions (the combination being determined by policy) to control the state of the BFD session, e.g., set it to AdminDown. I think that the use of policy to control the conditions that affect the P-tunnel reception state is the advantage of the proposed solution. What do you think?
4. For section 3.1.5, IMO the counter method has no relationship with the BFD function defined in this document. Will the counter method be used as a supplement to the BFD session?
GIM2>> As above, this is one of the conditions, controlled by the policy, that may be considered to influence the state of the BFD session.

Thanks,
Sandy
Original Message
From: GregMirsky <gregimirsky@gmail.com>
To: zzhang@juniper.net <zzhang@juniper.net>;
Cc: bess-chairs@ietf.org <bess-chairs@ietf.org>;Thomas Morin <thomas.morin@orange.com>;Robert Kebler <rkebler@juniper.net>;BESS <bess@ietf.org>;
Date: 2018-12-06 02:38
Subject: Re: [bess] WGLC,IPR and implementation poll for draft-ietf-bess-mvpn-fast-failover

Hi Jeffrey, thank you for the review, detailed questions, and helpful comments. Please find my notes and answers in-line, tagged GIM>>.

Regards,
Greg

On Fri, Nov 30, 2018 at 5:14 PM Jeffrey (Zhaohui) Zhang <zzhang@juniper.net> wrote:
Hi,
I have the following questions/comments:
The procedure described here is an OPTIONAL procedure that consists
of having a downstream PE take into account the status of P-tunnels
rooted at each possible upstream PEs, for including or not including
each given PE in the list of candidate UMHs for a given (C-S,C-G)
state.  The result is that, if a P-tunnel is "down" (see
Section 3.1), the PE that is the root of the P-tunnel will not be
considered for UMH selection, which will result in the downstream PE
to failover to the upstream PE which is next in the list of
candidates.
Is it possible that a p2mp tunnel is considered up by some leaves but down by some other leaves, leading them to choose different UMHs? In that case, procedures described in Section 9.1.1 ("Discarding Packets from Wrong PE") of RFC 6513 must be used. I see that this is actually pointed out later in section 6; it would be good to have a pointer to it right here.
GIM>> Would adding the following new text after the quoted text address your concern?
NEW TEXT:
If rules to determine the state of the P-tunnel are not
consistent across all PEs, then some may arrive at a different
conclusion regarding the state of the tunnel. In such a scenario,
procedures described in Section 9.1.1 of [RFC 6513] MUST be used.
Sandy> I think Jeffrey means that an egress PE may choose a new UMH after the "old" UMH fails. The egress PE may then still receive (C-S, C-G) flows from the old UMH's P-tunnel; these flows MUST be discarded according to section 9.1.1 of RFC6513.
GIM2>> I think that the proposed text may address the comment. I'm, as always, open to suggestions on how to modify or refine the proposed new text.
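
To make the procedure quoted at the start of this exchange concrete, here is a minimal sketch of UMH selection that skips any upstream PE whose P-tunnel is considered down. It assumes the highest-IP-address rule mentioned in this thread as the tie-breaker, and it assumes that a PE with no known tunnel state is treated as down; the function name select_umh and the tunnel_up mapping are hypothetical.

```python
import ipaddress
from typing import Dict, Iterable, Optional


def select_umh(candidates: Iterable[str], tunnel_up: Dict[str, bool]) -> Optional[str]:
    """Return the candidate upstream PE to use for a given (C-S,C-G), or None.

    candidates: IP addresses of the possible upstream PEs.
    tunnel_up:  locally determined P-tunnel status per upstream PE.
                A missing entry is treated as "down" (an assumption of this sketch).
    """
    # Exclude PEs that are the root of a P-tunnel considered down.
    eligible = [pe for pe in candidates if tunnel_up.get(pe, False)]
    if not eligible:
        return None
    # Among the remaining candidates, pick the numerically highest IP address.
    return max(eligible, key=lambda pe: int(ipaddress.ip_address(pe)))
```

Because tunnel_up is evaluated locally, two downstream PEs can return different results for the same candidate list, which is the case where the discard rule in Section 9.1.1 of RFC 6513 applies.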
Additionally, the text in section 3 seems to be biased toward Single Forwarder Election choosing the UMH with the highest IP address. Section 5 of RFC6513 also describes two other options, hashing or based on “installed UMH route” (aka unicast-based). It is not clear how the text in this document applies to hashing-based selection, and I don’t see how the text applies to unicast-based selection. Some rewording/clarification is needed here.
GIM>> How would the use of an alternative UMH selection algorithm change the documented use of p2mp BFD? Do you think that if the Upstream PE is selected using, for example, hashing, then the defined use of BGP-BFD and p2mp BFD itself is no longer applicable?
Sandy> Different UMH selection methods don't influence the p2mp BFD documented in this draft. IMO both section 3 and section 5 need to be mentioned here in order to avoid confusion.
GIM2>> Very helpful clarification, thank you. Please consider the following update to section 4:
OLD TEXT:
The procedures
require all the PEs of that MVPN to follow the single forwarder PE
selection, as specified in [RFC6513].
NEW TEXT:
The procedures
require all the PEs of that MVPN to follow the single forwarder PE
selection, as specified in [RFC6513], whether the PE is selected based
on its IP address, hashing algorithm described in section 5.1.3
[RFC6513], or Installed UMH Route.
For P-tunnels of type P2MP MPLS-TE, the status of the P-tunnel is
considered up if one or more of the P2MP RSVP-TE LSPs, identified by
the P-tunnel Attribute, are in Up state.
Why is “one or more of …” used in the above text?
GIM>> Would s/one or more of/at least one of/ address your concern?
Sandy> I am not sure there are situations in which two or more LSPs are used to deliver the same (C-S, C-G). IMO only the LSP used for forwarding needs to be monitored on the egress PE.
GIM>> I need to defer this to Thomas and Rob. If you agree with Sandy, should we just remove the sentence?
There are several occurrences of ((S, G)). I assume they should be  changed to (C-S, C-G).
GIM>> Indeed, globally replaced s/((S,G))/(C-S,C-G)/
A PE can be removed from the UMH candidate list for a given ((S, G))
if the P-tunnel for this (S, G) (I or S , depending) is leaf
triggered (PIM, mLDP)
Perhaps either remove the (I or S , depending) or move it to before the “for”.
GIM>> Moved before the "for".
This document defines the format and ways of usingr a new BGP
attribute called the "BGP- BFD attribute".
s/usingr/using/
GIM>> Yes, great catch.
o  MUST use [Ed.note] address as destination IP address when
transmitting BFD control packets;
[Ed.note]?
GIM>> Replaced [Ed.note] so that it reads as follows:
o  MUST use address in 127.0.0.0/8 range for IPv4 or in
0:0:0:0:0:FFFF:7F00:0/104 range for IPv6 as destination IP address
when transmitting BFD control packets;
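
As an illustration of the two ranges named in the replacement text, the following sketch checks whether a destination address falls in 127.0.0.0/8 for IPv4 or in 0:0:0:0:0:FFFF:7F00:0/104 for IPv6. It uses only Python's standard ipaddress module; the function name is_valid_bfd_destination is hypothetical.

```python
import ipaddress

# Ranges taken from the proposed text above.
IPV4_RANGE = ipaddress.ip_network("127.0.0.0/8")
IPV6_RANGE = ipaddress.ip_network("::ffff:7f00:0/104")  # 0:0:0:0:0:FFFF:7F00:0/104


def is_valid_bfd_destination(address: str) -> bool:
    """Return True if address lies in the range for its IP version."""
    addr = ipaddress.ip_address(address)
    network = IPV4_RANGE if addr.version == 4 else IPV6_RANGE
    return addr in network
```

The IPv6 range is the IPv4-mapped form of 127.0.0.0/8, so ::ffff:127.0.0.1 passes the check while the IPv6 loopback ::1 does not.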
If tracking of the P-tunnel by using a p2mp BFD session is to be
enabled after the P-tunnel has been already signaled, the the
procedure described above MUST be followed.
What if the tracking is to  be enabled before the P-tunnel has been signaled? The text implies different behavior?
GIM>> Not really, I guess. I think that the second sentence is important:
Note that x-PMSI A-D Route MUST be re-sent with exactly the same attributes as before and
the BGP-BFD Attribute included.
s/the the/then the/
GIM>> Done.
… The dedicated p2mp BFD session MAY monitor the state of
the Standby Upstream PE.
What does the above text mean? Do you mean “A different p2mp BFD session  …”?
GIM>> Yes, thank you for the suggested re-wording. Applied s/The dedicated/A different/
When such a procedure is used, in the context where fast restoration
mechanisms are used for the P-tunnels, leaf PEs should be configured
to wait before updating the UMH, to let the P-tunnel restoration
mechanism happen.  A configurable timer MUST be provided for this
purpose, and it is recommended to provide a reasonable default value
for this timer.
What does “such a procedure” refers to?
GIM>> Would s/When such a procedure is used/In such a scenario/ address your concern?
s/recommended/RECOMMENDED/?
GIM>> Great catch, thank you. Done.
3.1.7.  Per PE-CE link BFD Discriminator
The following approach is defined for the fast failover in response
to the detection of PE-CE link failures, in which UMH selection for a
given C-multicast route takes into account the state of the BFD
session associated with the state of the upstream PE-CE link.
3.1.7.1.  Upstream PE Procedures
For each protected PE-CE link, the upstream PE initiates a multipoint
BFD session [I-D.ietf-bfd-multipoint] as MultipointHead toward
downstream PEs.  A downstream PE monitors the state of the p2mp
session as MultipointTail and MAY interpret transition of the BFD
session into Down state as the indication of the associated PE-CE
link being down.
Since the BFD packets are sent over the P2MP tunnel, not the PE-CE link, my understanding is that the BFD discriminator is still for the tunnel and not tied to the PE-CE link; but, unlike the previous case, the root will stop sending BFD messages when it detects the PE-CE link failure. As far as the egress PEs are concerned, they don’t know whether it is a tunnel failure or a PE-CE link failure.
If my understanding is correct, the wording should be changed.
GIM>> There are ways, other than stopping transmission of BFD control packets, for the egress PE to distinguish the two conditions. For example, the MultipointHead MAY set the State to AdminDown and continue sending BFD control packets. If and when the PE-CE link is restored to Up, the MultipointHead can set the state to Up in the BFD control packet.
Sandy> I agree with Jeffrey. The BFD detection should be mapped to the specific flow/flows associated with X-PMSIs, not to the PE-CE link. The PE-CE link should influence the X-PMSIs and the associated (C-S, C-G) flows. The AdminDown function defined in BFD works normally.
GIM2>> The described behavior of the egress PE is optional and can be controlled by the local policy.
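
The distinction discussed in this exchange can be sketched as follows: an egress PE that still receives BFD control packets carrying AdminDown sees a head-signalled condition, whereas an egress PE whose detection timer expires sees a loss of packets on the tunnel. This is only an illustration under stated assumptions; the names Cause and classify are hypothetical, and how the egress PE reacts to either outcome is left to local policy, as noted above.

```python
from enum import Enum
from typing import Optional


class Cause(Enum):
    NONE = "session healthy"
    HEAD_SIGNALLED = "MultipointHead signalled AdminDown"
    DETECTION_TIMEOUT = "detection timer expired"


def classify(last_rx_state: Optional[str],
             seconds_since_last_packet: float,
             detection_time: float) -> Cause:
    """Classify what a MultipointTail currently observes.

    last_rx_state: State field of the most recent BFD control packet,
                   or None if no packet has been received yet.
    """
    # No packet within the detection time: packets are not arriving over the tunnel.
    if seconds_since_last_packet > detection_time:
        return Cause.DETECTION_TIMEOUT
    # Packets still arrive, but the head reports AdminDown (for example,
    # because policy ties the session to the PE-CE link state).
    if last_rx_state == "AdminDown":
        return Cause.HEAD_SIGNALLED
    return Cause.NONE
```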
…  If the route to the
src/RP changes such that the RPF interface is changed to be a new PE-
CE interface, then the upstream PE will update the S-PMSI A-D route
with included BGP-BFD Attribute so that value of the BFD
Discriminator is associated with the new RPF link.
If the RPF interface changes on the upstream PE, why should it update  the route to send a new discriminator? As long as there is a new RPF interface couldn’t the upstream PE do nothing but start tracking the new RPF interface?
GIM>> I'll defer this one to Thomas and Rob.
Sandy> I have the same question as Jeffrey. If the RPF interface changes on the upstream PE and a new route is generated with a new BFD discriminator, a new P2MP BFD session needs to be established and network stability will be affected. We need a function to guarantee that the existing BFD session is not affected.
GIM2>> I think that Jeffrey asked why the new BFD Discriminator must be sent and the new p2mp BFD session must be initiated. Your question, as I interpret it, is about how, operationally, an implementation can minimize the disruption when the new BFD session is advertised to replace one that already exists. Firstly, would we agree that sending the new BGP-BFD Discriminator and starting the new p2mp BFD session when the RPF interface changes is the right action? If we agree, then I can add a sentence or two to describe an optional procedure for the upstream PE to minimize the disruption when the egress PE switches to the new p2mp BFD session.
Regardless of which way is chosen (the currently described way or my imagined way), some text should be added to discuss how the downstream PE avoids switching to another upstream PE when the primary PE is just going through an RPF change.
GIM>>  Would appending the following text be acceptable to address your concern:
NEW TEXT:
To avoid unwarranted switchover a downstream PE MUST gracefully handle the
updated S-PMSI A-D route and switch to the use of the associated BFD
Discriminator value.
4.  Standby C-multicast route
The procedures described below are limited to the case where the site
that contains C-S is connected to exactly two PEs. The procedures
require all the PEs of that MVPN to follow the single forwarder PE
selection, as specified in [RFC6513].
Why would it not work with more than two upstream PEs?
Why is it limited to single forwarder selection? What about unicast  based selection?
GIM>> Again, asking for Thomas and Rob to help.
Sandy> I agree with Jeffrey. There is no limitation on advertising the same flows through more than two PEs. Maybe the text should be modified to refer to the UMH and the next best UMH.
GIM2>>  Thank you for the suggestion. Jeffrey and Sandy, would the following update address your concerns:
OLD TEXT:
The procedures described below are limited to the case where the site
that contains C-S is connected to exactly two PEs.  The procedures
require all the PEs of that MVPN to follow the single forwarder PE
selection, as specified in [RFC6513].
NEW TEXT:
The procedures described below are limited to the case where the site
that contains C-S is connected to two or more PEs though, to simplify
the description, the case of dual-homing is described.  The
procedures require all the PEs of that MVPN to follow the UMH
selection, as specified in [RFC6513], whether the PE is selected based
on its IP address, hashing algorithm described in section 5.1.3
[RFC6513], or Installed UMH Route.
This route, that has the semantics of being a 'standby'
C-multicast route, is further called a "Standby BGP C-multicast
route", and is constructed as follows:
o  the NLRI is constructed as the original C-multicast route, except
that the RD is the same as if the C-multicast route was built
using the standby PE as the UMH (it will carry the RD associated
to the unicast VPN route advertised by the standby PE for S)
Since you mention RD, you might as well mention it carries a Route  Target derived from the standby RE’s UMH route’s VRF RT Import EC.
GIM>> Would the following be acceptable:
NEW TEXT:
o  the NLRI is constructed as the original C-multicast route, except
that the RD is the same as if the C-multicast route was built
using the standby PE as the UMH (it will carry the RD associated
to the unicast VPN route advertised by the standby PE for S and a
Route Target derived from the standby PE's UMH route's VRF RT
Import EC)
If at some later point the local PE determines that C-S is no longer
reachable through the Primary Upstream PE, the Standby Upstream PE
becomes the Upstream PE, and the local PE re-sends the C-multicast
route with RT that identifies the Standby Upstream PE, except that
now the route does not carry the Standby PE BGP Community (which
results in replacing the old route with a new route, with the only
difference between these routes being the presence/absence of the
Standby PE BGP Community).
Additionally the LOCAL_PREF should also change?
GIM>> Like normative SHOULD?
4.3.  Reachability determination
The standby PE can use the following information to determine that
C-S can or cannot be reached through the primary PE:
Shouldn’t this be 4.2.1 instead of 4.3?
GIM>> Yes, agree. Thank you.
5.  Hot leaf standby
The mechanisms defined in sections Section 4 and Section 3 can be
used together as follows.
This section is a little confusing to me. It seems that it really  should be how a leaf should behave when hot root standby is used, not that there is a “hot leaf” mode. A leaf is just a leaf, not a cold/warm/hot/primary/standby leaf.
GIM>> Would re-naming the section to "Use of Standby C-multicast Route" better reflect the content of the section?
Thanks.
Jeffrey
From: BESS <bess-bounces@ietf.org> On Behalf Of stephane.litkowski@orange.com
Sent: Thursday, November 22, 2018 2:54 AM
To: bess@ietf.org
Cc: bess-chairs@ietf.org
Subject: [bess] WGLC, IPR and implementation poll for draft-ietf-bess-mvpn-fast-failover

Hello Working Group,
This email starts a two-week Working Group Last Call on draft-ietf-bess-mvpn-fast-failover-04  [1]

This poll runs until *the 6th of December*.

We are also polling for knowledge of any undisclosed IPR that applies to this Document, to ensure that IPR has been disclosed in compliance with IETF IPR rules (see RFCs 3979, 4879, 3669 and 5378 for more details).
If you are listed as an Author or a Contributor of this Document please respond to this email and indicate whether or not you are aware of any relevant undisclosed IPR. The Document won't progress without answers from all the Authors  and Contributors.

Currently two IPRs have been disclosed against this Document.

If you are not listed as an Author or a Contributor, then please explicitly respond only if you are aware of any IPR that has not yet been disclosed in conformance with IETF rules.

We are also polling for any existing implementation as per [2].
Thank you,
Stephane & Matthew

[1]  https://datatracker.ietf.org/doc/draft-ietf-bess-mvpn-fast-failover/

[2]  https://mailarchive.ietf.org/arch/msg/bess/cG3X1tTqb_vPC4rg56SEdkjqDpw