[spring] Re: Re: Comments on draft-geng-spring-sr-redundancy-protection

Hi Jeffrey,
Please see my inline replies, marked with Fan2>>.
Regards,
Fan

-----Original Message-----
From: spring [mailto:spring-bounces@ietf.org] On Behalf Of Jeffrey (Zhaohui) Zhang
Sent: March 26, 2021 3:19
To: Gengxuesong (Geng Xuesong) <gengxuesong@huawei.com>; spring@ietf.org; Rishabh Parekh (riparekh) <riparekh@cisco.com>; Arvind Venkateswaran (arvvenka) <arvvenka@cisco.com>
Subject: [spring] Comments on draft-geng-spring-sr-redundancy-protection

Hi Xuesong, Mach, Fan,

Some comments/questions on the proposal.

1. We don't need an additional "redundancy segment" for the replication semantics. Existing "replication segment" (draft-ietf-spring-sr-replication-segment) can be used as is, especially for the scenario where the original header already carries (FI, SN) information.

------[FY1]: three considerations here:

a). For the scenario you mentioned, that is correct: the redundancy segment and the replication segment share the common behavior of "packet duplication". The significant difference between the two segments is the behavior of adding FI and SN. Unfortunately, no SRv6 application is required to carry (FI, SN) information in the IPv6 header, so the more common scenario is the one where the original packet doesn't carry (FI, SN). The current design of the redundancy segment is therefore based on this scenario.

 Zzh> Since the presentation talked about scenario where the (FI, SN) information is already carried, it is fair to discuss that in my initial comments; I understand that you want to focus on the other scenario, and that’s fine – see later comments below.

Fan1>> Before we dive into the detailed design, I would like to come back to the two scenarios first. Scenario 1 is where the traffic already carries a Flow Identification (FI) before it is replicated:

In this case, FI could be carried either as the IPv6 Flow Label in the IPv6 basic header or in other EH TLVs. RFC6437 specifies the use of the Flow Label for stateless load distribution, and many existing implementations follow it. Since redundancy protection and ECMP can be needed in the network at the same time, the flow label cannot carry both semantics unless RFC6437 is extended. In other words, at present the flow label cannot be used to carry FI for redundancy protection.

As for carrying FI in IPv6 EH TLVs, currently no RFC specifies this or anything similar; it is purely hypothetical. The only reason I can see is that the controller has already identified this flow as needing redundancy protection somewhere, but the replication is not planned to happen at the headend, so it assigns FI at the headend in the SRv6 policy together with the SID list. Potential reasons are that the headend itself has no branches, that the SID list represents an E2E path for the service while the multiple redundant paths exist only along part of that path, or that bandwidth is to be saved in the network. If that is the case, it just means there are two choices for where to assign FI: at the headend or at the redundancy node. We should then discuss which place is better for marking FI into the packet. In the draft, we insist on adding FI at the redundancy node, as FI does not necessarily need to be globally managed. So it comes back to the second scenario, where there is no FI in the packet. All in all, there is only one scenario: FI is encapsulated at the redundancy node, not before.

I didn’t include SN here, because FI and SN are actually different. It is reasonable to assign FI from the controller, as FI is a flow-based parameter. But SN should be encapsulated on the endpoint itself, as it is a packet-based parameter. Based on this, I am afraid no one would choose to assign FI at the headend and then separately add SN and replicate the packet at the redundancy node.

Thus, for redundancy protection, both adding (FI, SN) and packet replication should be included in the endpoint behavior of the redundancy segment.
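
To make the behavior argued for above concrete, here is a minimal Python sketch of a redundancy node that stamps (FI, SN) on a packet and replicates it onto each redundant path. It is an illustration under assumptions, not the draft's specification: the `Packet` and `RedundancySegment` names, the textual SIDs, and the 16-bit SN wrap-around are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Packet:
    payload: bytes
    sid_list: List[str]           # remaining segments (hypothetical textual SIDs)
    fi: Optional[int] = None      # Flow Identification, absent before the redundancy node
    sn: Optional[int] = None      # Sequence Number, absent before the redundancy node


@dataclass
class RedundancySegment:
    flow_id: int                  # FI is per flow, e.g. handed down by a controller
    paths: List[List[str]]        # one SID list per redundant path
    next_sn: int = 0              # SN is per packet, so the node itself keeps the counter
    sn_bits: int = 16             # assumed SN width, not taken from the draft

    def process(self, pkt: Packet) -> List[Packet]:
        """Add (FI, SN) once, then emit one copy per redundant path."""
        sn = self.next_sn
        self.next_sn = (self.next_sn + 1) % (1 << self.sn_bits)
        return [Packet(pkt.payload, list(path), self.flow_id, sn)
                for path in self.paths]


if __name__ == "__main__":
    seg = RedundancySegment(flow_id=7, paths=[["A", "M"], ["B", "M"]])
    for copy in seg.process(Packet(b"data", [])):
        print(copy.sid_list, copy.fi, copy.sn)
```

Every copy of one input packet carries the same (FI, SN) pair, which is what lets a merging node later recognize them as duplicates.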

Zzh3> In the above long paragraphs you explain why you think it is better to add FI/SN at the replication node. Even in the case where the (FI,SN) is added at the replication node, using replication segment augmented with semantics of adding (FI,SN) still works well.

Fan2>> No problem. It is important to understand the actual scenario, so that redundancy protection can be properly designed based on correct assumptions.

Zzh3> As for whether it is desired to add FI/SN at the headend, I would say there are certainly good reasons to do so, but I will defer that to a separate discussion.

Fan2>> Sure, I look forward to starting that topic.

Fan>> I read the replication segment draft, and I have two questions about using the replication segment for redundancy protection.

1) I believe the merging node should be the downstream node, since the nodes preceding the merging node should not be redundancy-protection aware. In this case, there will be at least two identical downstream nodes. The replication segment draft does not define such a situation.

Zzh2> That is not explicitly excluded, and that does not mean it can’t be used.

Fan2>> Yes, but it will introduce more parameters to the replication SID, which already has a complex logical structure.

2) The draft states that a replication SID MUST only appear as the ultimate SID in a SID list. What if the merging node is not the last node of the SRv6 E2E path?

Zzh2> There is a requirement that there must be no “topological” SID. The intention is to prevent the situation where a node SID comes after the replication SID, causing duplicate packets to be sent to that node. That is reasonable for the original intention of the replication segment, but it is now reasonable to remove it because of this new use case, where we do want the replicated packets to reach the same merging node. We’d rather remove the restriction than define a new segment.

Fan1>> If this restriction is removed, then, as draft-ietf-spring-sr-replication-segment states, the behavior at the downstream node of a replication segment is undefined. What is the solution here?

Zzh3> As I said already, the reason for that document to state so is because the topological segment would get duplicated packets. We did not think that makes sense in a regular replication situation, but obviously the redundancy use case is perfectly fine, so we will remove that text or modify accordingly to point out where it makes sense.

Fan2>> As I understand it, there is actually a forwarding blackhole on the merging node if it is not the last hop of the SR path. In terms of the replication segment, the merging node is the downstream node, and the downstream node is also represented as a replication segment. For simplicity, assume the merging node is a leaf node. According to the End.Replicate definition, the MPLS label or IPv6+SRH header is removed at this point, and there is no definition of how to forward the inner packet to the next hop.

Unless End.Replicate is changed, simply removing the restriction “MUST only appear as the ultimate SID in a SID-List” doesn’t work.

Moreover, as we discussed, if the replication segment is used as the redundancy segment, the downstream node is actually the merging node, which has its own endpoint behavior. As I understand the replication segment definition, the leaf node performs the endpoint behavior of the replication segment. Are you going to define another branch, for the merging segment endpoint behavior, inside the replication segment?

Zzh3> No. Just put your merging segment after the replication segment. The only change to the replication segment is that for the replication node, you may augment it with the semantics of adding FI/SN. No other changes at all.

Fan2>> draft-ietf-spring-sr-replication-segment states: “Notice that the segment on the leaf node is still referred to as a Replication segment for the purpose of generalization.” In other words, the segment on the merging node is always a replication segment, so there is no way to perform the merging behavior defined in the merging segment.

b). Even though the IPv6 flow label could be encapsulated in the header, it is used for ECMP or fragmentation; redundancy protection cannot simply reuse it, since flow ID allocation depends on the merging node's capability.

Zzh> IPv6 flow label is irrelevant here – it’s not discussed in either your draft/presentation or in my comments – so we can ignore this.

Fan>> I mentioned the IPv6 flow label because we had this discussion in the DetNet WG. I agree we can come back to this thread when needed.

c). In protocol design, it is important to reuse existing implementations as much as possible. However, instantiating a segment is a different story. RFC8986 defines 15 End behaviors and 4 headend behaviors. We understand the principle there is to keep the semantics of each segment, and any further function definitions, neat so that segment routing forwarding stays clear and efficient. Enhancing the replication segment to support the redundancy segment seems to be quite the opposite methodology.

 Zzh> RFC 8986 does specify additional flavors of End and End.X function with USP, PSP and USD behaviors which are modifications to base End and End.X function; exactly what we are proposing here – enhancing Replication Segment to add (FI,SN) when required.

Fan1>> If every function could be folded into one segment as an enhancement, it would not really be necessary to define 15 End behaviors in RFC8986; one complex End behavior could do everything.

Fan>> Can you explain more? I don’t see the correlation between flavors and adding (FI, SN).

 2. Even for the scenario where the (FI, SN) information needs to be added by the redundancy node, the existing "replication segment" can be enhanced to add the (FI, SN) information.

-----[FY2]: The replication segment provides P2MP replication targeting multicast services, while the redundancy segment aims to provide redundant flow protection for URLLC services. Adding (FI, SN) doesn’t bring value to multicast services, and having the stitching capability of replication on the redundancy node seems wasteful and impractical for URLLC services. Twisting them together in one segment results in a complicated function, where maybe only one type of service is required on the node.

 Zzh> Adding (FI, SN) information is only to replication segments that are used for replication for unicast redundancy purpose. It does not mean all replication segments will be added with (FI, SN) semantics.

Fan>> How would you define the Boolean switch that differentiates multicast replication from redundancy protection in one segment? Also, we currently don’t exclude redundancy protection for multicast traffic.

Zzh2> There are two ways to do it.

Zzh2> 1. A replication segment now carries an additional attribute about adding FI/SN information. That does mean the redundancy node cannot use the same replication segment for both regular replication (w/o adding FI/SN information) and redundancy replication purposes. However, that does not mean we should not extend the existing replication segment for redundancy purpose. Also note an interesting use of replication segments here – say the redundancy node is N1 (who adds the FI/SN information) but the actual replication node could be N2. The replication tree does start at N1 but only one copy is sent to N2, who does the real replication. Now N1 will have two replication segments – one for regular multicast purpose and one for redundancy, but they will share the same replication segments downstream (because only the redundancy node adds the FI/SN information).

Fan1>> In fact, I think you have raised a very good example of why we should not combine the replication segment and the redundancy segment into one segment. It makes service deployment complicated and confusing.

Replication SID and Tree SID are defined for P2MP scenarios. Two SIDs are defined because multicast services have root, bud, and leaf roles. In redundancy protection, however, the redundancy node has straightforward and unique semantics, and its endpoint behavior can be defined simply and cleanly. Why would I abandon a new segment with clear endpoint behavior and instead become a branch of another segment’s behaviors? The argument for not introducing another segment is not very sound, because you need to differentiate between original replication and redundancy protection within the replication segment anyway. I don’t understand exactly what resources we are saving.

Zzh3> A replication segment is a simple building block that replicates packets to a bunch of downstream nodes (and each replication branch can have a segment list to specify the path). A replication tree made of concatenated replication segments provides P2MP service from a root to many leaves, potentially via intermediate nodes.

Zzh3> As such, a single replication segment can be used for redundancy purposes: w/o any changes at all if the replication node does not need to add (FI,SN), and w/ a simple augmentation (a Boolean indication) to add (FI,SN) if the replication node needs to add (FI,SN).

Fan2>> Agree. This part of modification is fine. The key problem is described above.

Zzh3> What I describe in the above zzh2> is another example of using a replication tree when you don’t want to put all the burden on a single node.

Fan2>> This example hints that operators should pay more attention to service deployment when both multicast and redundancy protection services exist in the network.

Zzh3> As you can see, the replication segment (w/ the (FI,SN) augmentation when needed) and SR-P2MP (aka tree-sid) provide all the redundancy needs.

Fan2>> IMHO SR P2MP policy and Tree-SID are totally unnecessary for redundancy protection.

An SR P2MP policy is identified by the tuple <Root, Tree-ID>. The two parameters are meaningless and inappropriate for a redundancy protection service. There isn’t a tree or root at all.

In our draft, redundancy segment performs the packet replication and adds (FI,SN), redundancy policy provides multiple simultaneous paths. The mechanism is much simpler than SR-P2MP policy/Tree-SID.

We don’t want to put unnecessary burden on redundancy protection implementation.

Zzh2> 2. We can separate out the semantics of adding FI/SN. This is easy to do with SRv6 – just use the argument bits to indicate that. For MPLS, a separate label may be used before the regular replication SID – that label will add the FI/SN information and the following replication SID will do the replication.

Fan1>> Adding FI is flow-based; I don’t think it is a good idea to use a segment-based argument to indicate it.

Zzh3> I don’t understand the logic here. If and only if the packets of a flow include a replication segment w/ the (FI,SN) indication, then you get the desired result.

Zzh2> Not excluding redundancy protection for multicast traffic is actually a good reason to use replication segment 😊 You can see that a replication segment, either with “adding FI/SN” semantics embedded or explicitly indicated by a preceding “add FI/SN” label or by a trailing “add FI/SN” SRv6 arg bits, can be used for both multicast and unicast traffic. In case of multicast, as long as two or more branches eventually converge to the merge node, redundancy protection is achieved.
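
The merge-node side of this convergence can be sketched as follows: whichever branch delivers a given (FI, SN) first wins, and later copies are dropped. This is only an illustration. The `MergingNode` class, its `accept` method, and the bounded per-flow history window are assumptions for the sketch and are not defined in either draft.

```python
from collections import OrderedDict
from typing import Dict


class MergingNode:
    """Drops duplicate packets of a flow, keyed by (FI, SN)."""

    def __init__(self, history: int = 1024) -> None:
        self.history = history                        # assumed per-flow window size
        self._seen: Dict[int, OrderedDict] = {}       # FI -> recently seen SNs

    def accept(self, fi: int, sn: int) -> bool:
        """Return True for the first copy of (fi, sn), False for duplicates."""
        seen = self._seen.setdefault(fi, OrderedDict())
        if sn in seen:
            return False                              # a copy already arrived on another branch
        seen[sn] = None
        if len(seen) > self.history:
            seen.popitem(last=False)                  # forget the oldest SN
        return True


if __name__ == "__main__":
    node = MergingNode()
    # Two branches deliver the same packets of flow 7; only one copy of each is forwarded.
    for branch, sn in [("upper", 0), ("lower", 0), ("lower", 1), ("upper", 1)]:
        print(branch, sn, "forward" if node.accept(7, sn) else "drop")
```

The same logic applies whether the replicated traffic is unicast or multicast, as long as the copies carry matching (FI, SN) values when they reach the merge node.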

Zzh> I don’t follow your argument about “seems a waste and unpractical to URLLC service”.

Zzh> I don’t follow your argument about “Twisting them together in one segment results in a complicated function where maybe only one type of service is required on the node” either. If you only need regular multicast service, the replication segment does not need the semantics to add (FI, SN) information. If you need redundancy protection, then the replication segment does have the semantics to add (FI, SN) information. If you need both, then some will have those semantics and some will not; and if you have a scenario where you don’t need to add (FI, SN) information for redundancy protection, then the existing replication segment w/o the additional semantics to add (FI, SN) information can be used for both. All can be achieved with a simple Boolean switch added to the replication segment.
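
The single-switch proposal in the paragraph above could look roughly like the sketch below. It is a hedged illustration of the idea only: the `ReplicationSegment` class, its `add_fi_sn` field, and the branch names are hypothetical and do not come from draft-ietf-spring-sr-replication-segment.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Tag = Optional[Tuple[int, int]]   # (FI, SN), or None when nothing is added


@dataclass
class ReplicationSegment:
    branches: List[str]               # hypothetical downstream node names
    add_fi_sn: bool = False           # the Boolean switch under discussion
    flow_id: Optional[int] = None     # only meaningful when add_fi_sn is True
    next_sn: int = 0

    def replicate(self, payload: bytes) -> List[Tuple[str, bytes, Tag]]:
        tag: Tag = None
        if self.add_fi_sn:
            # Redundancy use: stamp one (FI, SN) shared by all copies.
            tag = (self.flow_id, self.next_sn)
            self.next_sn += 1
        # Regular replication: copies go out untouched.
        return [(branch, payload, tag) for branch in self.branches]


if __name__ == "__main__":
    multicast = ReplicationSegment(branches=["L1", "L2"])
    redundancy = ReplicationSegment(branches=["M", "M"], add_fi_sn=True, flow_id=5)
    print(multicast.replicate(b"p"))
    print(redundancy.replicate(b"p"))
```

Whether this conditional belongs inside one segment or should be split into two segments is exactly the point of disagreement in the thread; the sketch takes no side.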

Fan>> After seeing all these “if, then” cases above, I feel even more strongly in favor of keeping the two segments separate. ☺ In RFC8986, no single Endpoint behavior has such an “if, then” structure to specify different functions.

 Zzh> Note that Replication Segment is not tied to multicast either (the draft only mentioned multicast once as one use case):

   We define a new type of segment for Segment Routing [RFC8402<https://tools.ietf.org/html/rfc8402>], called
   Replication segment, which allows a node (henceforth called as
   Replication Node) to replicate packets to a set of other nodes
   (called Downstream Nodes) in a Segment Routing Domain. Replication
   segments provide building blocks for Point-to-Multipoint Service
   delivery …

Zzh> It is about replicating packets to a set of other nodes – quite applicable here as a building block.

Fan>> I do think the replication segment has a very elegant design; however, identical downstream nodes and the design of the P2MP SR policy (which indirectly involves Tree-ID) may place too much burden on the redundancy segment. Still, further discussion on the replication segment and the redundancy segment is very welcome.

Zzh2> Please see comments earlier 😊

Zzh2> Also tree-id is not a concern. Tree-ID is only needed when multiple replication segments need to be signaled to different tree nodes. A simple redundancy case is like ingress replication and only a single replication segment is needed, so tree-id is just an internal thing on the redundancy node. Regardless, the key is that the existing replication segment concept can be used for redundancy purposes.

Fan1>> Tree SID and Tree-ID are useless for redundancy protection; what semantics should they have for redundancy protection?

Zzh3> The above has no basis. Please see my earlier zzh3> comments.

Zzh3> Jeffrey

3. I wonder why (FI, SN) information is added as a TLV in the SRH. Would it be better to use DOH?

-----[FY3]: If (FI, SN) is encapsulated as a TLV, both SRH and DOH are feasible. Actually, (FI, SN) information is only meaningful to the merging node, so putting it in the arg part of the replication segment doesn't help.

Zzh> While I do think it is better to put the actual (FI, SN) information in the DOH, I did not talk about adding (FI, SN) information to the arg part of an SRv6 SID. I was saying that the argument of an SRv6 replication SID can serve as that Boolean switch to indicate if (FI, SN) information needs to be added.
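
A sketch of how an argument bit could act as that Boolean switch. Everything specific here is an assumption made for illustration: the 64/16/48-bit split of LOC:FUNCT:ARG, the use of the lowest ARG bit as the flag, and the function value 0x0100. None of these values comes from either draft.

```python
import ipaddress

LOC_BITS, FUNCT_BITS, ARG_BITS = 64, 16, 48   # assumed LOC:FUNCT:ARG split
ADD_FI_SN_FLAG = 0x1                          # hypothetical flag: lowest ARG bit


def build_sid(locator: str, funct: int, add_fi_sn: bool) -> ipaddress.IPv6Address:
    """Compose a 128-bit SID from a locator prefix, a function, and the flag."""
    loc = int(ipaddress.IPv6Address(locator)) >> (128 - LOC_BITS)
    arg = ADD_FI_SN_FLAG if add_fi_sn else 0
    value = (loc << (FUNCT_BITS + ARG_BITS)) | (funct << ARG_BITS) | arg
    return ipaddress.IPv6Address(value)


def wants_fi_sn(sid: ipaddress.IPv6Address) -> bool:
    """Read the flag back from the ARG part of the SID."""
    arg = int(sid) & ((1 << ARG_BITS) - 1)
    return bool(arg & ADD_FI_SN_FLAG)


if __name__ == "__main__":
    plain = build_sid("2001:db8:0:1::", 0x0100, add_fi_sn=False)
    flagged = build_sid("2001:db8:0:1::", 0x0100, add_fi_sn=True)
    print(plain, wants_fi_sn(plain))
    print(flagged, wants_fi_sn(flagged))
```

Note that the argument only signals whether (FI, SN) should be added; the (FI, SN) values themselves would still travel elsewhere in the packet, as discussed in this thread.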

Fan>> so far, this approach works for me.

Zzh2> It can work, but since only the merging node uses the FI/SN information, it is more of a DOH thing than an SRH thing.

Zzh2> Thanks!

Zzh2> Jeffrey

For #1, and #2, reusing/enhancing existing replication segment has the following benefits:

a. Reduce protocol/implementation work

b. Reduce the amount of state in the network (the same P2MP tunnel can be used for both multicast traffic and unicast redundancy)

b) can be achieved even with #2 (the redundancy node needs to add (FI, SN) information): for SRv6, the semantics of adding (FI, SN) can be indicated by the arg part of the replication SID, and for SR-MPLS it can be indicated by an additional label in front of the replication SID label. If using an additional label is a concern, then a single label can indeed be used to indicate both "add FI/SN information" and "replicate", and the replication semantics can still be set up using the replication segment infrastructure.

For SR-MPLS, where would you put the (FI, SN) information? Seems that GDFH (draft-zzhang-intarea-generic-delivery-functions) is a good option and that can be used for SRv6 as well (anything in DOH that is actually independent of IP could be extracted out to GDFH).

-----[FY4]: For SR-MPLS, the authors currently plan to stay consistent with the specification in RFC8964. The original intention of this draft is to provide a PREOF solution in the SR data plane for DetNet. That's why the draft was discussed first in DetNet and then came to SPRING. FYI, the DetNet MPLS data plane uses a separate service label (S-Label) and the PW MPLS Control Word [RFC4385] to carry FI and SN.
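
For illustration, the S-Label plus control word encoding mentioned above can be sketched in a few lines of Python. The sketch assumes the standard 32-bit MPLS label stack entry layout (20-bit label, 3-bit TC, 1-bit bottom-of-stack, 8-bit TTL) and a control word whose first nibble is zero followed by a 28-bit sequence number; the helper names and the label value 1000 are hypothetical, and RFC8964 remains the authoritative reference for the exact format.

```python
import struct


def label_stack_entry(label: int, tc: int = 0, bottom: bool = True, ttl: int = 64) -> bytes:
    """Encode one MPLS label stack entry: label(20) | TC(3) | S(1) | TTL(8)."""
    word = ((label & 0xFFFFF) << 12) | ((tc & 0x7) << 9) | (int(bottom) << 8) | (ttl & 0xFF)
    return struct.pack("!I", word)


def control_word(sn: int) -> bytes:
    """Encode a control word: first nibble 0000, then a 28-bit sequence number (assumed width)."""
    return struct.pack("!I", sn & 0x0FFFFFFF)


def detnet_mpls_header(s_label: int, sn: int) -> bytes:
    """S-Label identifies the flow (FI role); the control word carries SN."""
    return label_stack_entry(s_label) + control_word(sn)


if __name__ == "__main__":
    print(detnet_mpls_header(s_label=1000, sn=5).hex())
```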

Zzh> I forgot that the DetNet MPLS data plane already reuses the PW CW for SN information. That’s fine and there is no need to introduce GDFH for MPLS.

Zzh> Thanks.

Fan>> Thanks for bringing this topic up for deeper discussion. Redundancy protection should be taken into consideration by both SPs and vendors if URLLC services are to be guaranteed.

Zzh> Jeffrey

Thanks.

Jeffrey

Juniper Business Use Only

_______________________________________________

spring mailing list

spring@ietf.org

https://www.ietf.org/mailman/listinfo/spring
