Re: [spring] Comments on draft-spring-segment-routing-policy

"Siva Sivabalan (msiva)" <msiva@cisco.com> Fri, 19 October 2018 01:42 UTC

From: "Siva Sivabalan (msiva)" <msiva@cisco.com>
To: "spring@ietf.org" <spring@ietf.org>
CC: "robjs=40google.com@dmarc.ietf.org" <robjs=40google.com@dmarc.ietf.org>
Thread-Topic: [spring] Comments on draft-spring-segment-routing-policy
Thread-Index: AQHUI5HnrmwGBsDxVEmQ6kno4nzKBaUilxmAgAOi+5A=
Date: Fri, 19 Oct 2018 01:42:49 +0000
Message-ID: <9ba2813b9f9b4800baf4ba8255242d6b@XCH-ALN-011.cisco.com>
References: <CAHd-QWu9VzrvSxo_NMwsigo9v8i=_QzYA+hc3eJf65q215Ff=A@mail.gmail.com> <5424EA3D-A726-4D96-950E-9C63ED93D49B@cisco.com>
In-Reply-To: <5424EA3D-A726-4D96-950E-9C63ED93D49B@cisco.com>
Accept-Language: en-US
Content-Language: en-US
Content-Type: multipart/alternative; boundary="_000_9ba2813b9f9b4800baf4ba8255242d6bXCHALN011ciscocom_"
MIME-Version: 1.0
Archived-At: <https://mailarchive.ietf.org/arch/msg/spring/rwQC6ONUtJBa7iy0X9dem_OISbw>
Subject: Re: [spring] Comments on draft-spring-segment-routing-policy
Precedence: list

Hi Rob,

Many thanks for your comments.

Please see our response in-line.

Thanks,
Siva
P.S. We will be posting a new revision of the SR policy draft reflecting your comments.


From: spring <spring-bounces@ietf.org> on behalf of Rob Shakir <robjs=40google.com@dmarc.ietf.org>
Date: Tuesday, July 24, 2018 at 5:04 PM
To: "draft-ietf-spring-segment-routing-policy@tools.ietf.org" <draft-ietf-spring-segment-routing-policy@tools.ietf.org>, SPRING WG List <spring@ietf.org>
Subject: [spring] Comments on draft-spring-segment-routing-policy

(As an individual contributor)

Hi Authors,

I have a number of comments on draft-spring-segment-routing-policy, some are technical, some are editorial.  Frankly, I find this document quite difficult to read since it doesn't coherently make up a specification - and rather jumps about quite a lot, with a number of things being re-stated. I understand that this happens during development of a specification, but now this is a WG doc, we should focus on making this document as clear as it can be.

Please find my specific comments below:

  *   (1. Introduction) In this section you use the wording "augmented with an ordered list of segments" - I would suggest rewording this. Whilst it might be the case for SRv6, for SR-MPLS, either we're doing pop+push, or simply pushing the new segment list. In this case, it's not really "augmenting", but adding a new stack of headers.
For SR-MPLS, when we steer into the SR Policy, we impose the label stack corresponding to the SID list on the packet, so we are adding labels. Since the word “imposition” has special connotations in MPLS, we thought that “augment” would be a neutral word that fits both SRv6 and SR-MPLS.

  *   (2. SR Policy) The introduction to this section isn't really clear to me.

     *   What is meant by a "specific intent"? I think that multiple "intents" might have the same SR policy - e.g., some latency optimisation and IGP shortest path are different "intents", but are likely to mean the same policy. Consider rewording this to a "common pathing behaviour" or similar.
You are right that an SR Policy with latency optimization intent and one with IGP shortest path intent may have the same path – and hence the same “pathing behavior”. But their intents differ, and their paths would quite likely diverge after a topology change or other change in the network. Hence the emphasis on “intent” rather than on “pathing”.

     *   Why do we need to refer to what the SR architecture says about instructions? It seems to me here the core point is "An SR Policy may consist of any type of segment". Is there a technical reason to not just word this more simply?
The reason is simply to remind the reader that the instructions are not just topological (e.g. for a TE use-case) but broader, including others such as service segments.

  *   (2.1 Identification of an SR policy) Where there is a <headend, endpoint, colour> tuple, I believe that the intention here is to say that the headend IPv[46] address is domain unique -- is this expected, if so, should it be stated?
Agree and will add text to clarify the same.

  *   (2.1) "An implementation MAY allow assignment of a symbolic name" - the MUST NOT that is specified here is interesting. I believe Harish asked a similar question in the working group. What is the intention with the MUST NOT? Operationally, should these names be unique per policy (otherwise, it seems they are probably not that user friendly for identification).
We will update the text to remove the MUST NOT and clarify the usage.

  *   (2.2) There is some repetition in this section which makes it quite unclear -- I believe the key points are:

     *   A policy has a set of candidate paths.
     *   Those candidate paths can be dynamic or explicit.
     *   Candidate paths comprise of a set of one or more SID lists.
     *   Explicit candidate paths have externally (to the router) specified SID lists. Dynamic ones are computed on the SID lists.
     *   Within each candidate path, traffic is load-balanced across the SID lists according to a weight.

Is this section saying something other than the above -- if not, I recommend rewording it so that it does not have as many decision criteria to parse.
We will remove a repeated sentence in there to improve the text.

  *   (2.2) The whole sub-paragraph about replication is hugely underspecified. We say above that traffic is load-balanced across the SID lists - but then we say that traffic might be replicated. The IDR draft, and this draft have no further mention of "replication". I would highly recommend removing this section, since I think specifying this fully requires significantly more work, and is better done in another draft.
We will remove this text from this draft since this topic is now being worked on in draft-voyer-spring-sr-p2mp-policy, which was not available at the time of writing the SR Policy architecture draft.

  *   (2.3) I would recommend that the "Local" should simply say "via configuration". "Yang model through NETCONF" isn't accurate since YANG and NETCONF are not coupled, and gRPC here is ambiguous
Agree and will update the text.

  *   (2.3 + 2.12) There are a lot of priorities and preferences going on in this document.

     *   We have protocol origin, preference, and then priority. I would recommend restructuring to discuss clearly how these all relate to each other.
Sec 2.9 does this.

     *   From my parsing, your intention is to say:

        *   Within an individual SR policy the candidate path to select is specified using preference. If this is the case, then the load-sharing section in 2.2 should state that this is only when the candidate paths are of the same preference.
Selection ensures one and only one CP is selected and load sharing happens between Segment-Lists of the selected candidate path.

           *   See further comments on candidate paths and backups below.

        *   If multiple SR policies that have the same <colour, endpoint> are specified, then the protocol-origin is used to select between them. You should explicitly state whether protocol-origin is better as lower or higher.
It is specified that higher is better. Please see Sec 2.9 where the tiebreaker is explained.

        *   There is some head-end specific recomputation priority that is used only for dynamic candidate paths. If this is the case, then I'd recommend explicitly calling this "recomputation-priority".
It is not specific to dynamic paths since even explicit paths need to be evaluated for reachability/validity. We will add text to clarify this in Sec 2.12.

  *   (2.3+2.12+9.3) There are many spanners thrown into the works throughout this document as to how things actually operate with different preferences. For example:

     *   If I have multiple candidate paths all with the same weight - I think you say we should load-share across the weights. It doesn't seem to be stated when I should failover to the lower preference paths.
As mentioned above, there is no load-balancing between candidate paths; it happens between Segment Lists within a specific candidate path.

     *   My assumption is that I do not compare preference across different protocol origins -- i.e., I pick the protocol-origin that is best, and then select within its candidate paths - such that it is only when ALL BGP (for example) paths have failed, that I go and look at ones configured statically. This would be my preference - but it should be clarified in the doc.
Preference is compared across origins and has higher priority than the protocol origin. Please see Sec 2.9.

     *   The backup semantics that are in this doc are really preference failover AFAICS?
This is just preference failover and not backup in the sense of fast-reroute. Sec 9.3 talks about the backup in the context of fast-reroute. As such they are independent.
There is mention that "another ... candidate path MAY be designated as a backup for a specific or all ... candidate paths". How does this work? Where is this signalled that a particular policy is a backup for another?
Since this is SR Policy architecture, it does not cover the signaling aspects – those should be worked out in separate documents. I agree that BGP-SRTE for instance, does not include this indication of backup yet. However, there is an interest to add this.
How does this then interact with the preferences and protocol origin values?
The backup notion in Sec 9.3 is independent of tie-breaker. The backup may be picked as the next best candidate path OR a candidate path which has been explicitly signaled as a backup for the best candidate path that was selected.
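
The selection order discussed in this item (exactly one active candidate path, chosen by preference first, with protocol origin used only as a tiebreaker, higher being better for both) can be illustrated with a minimal Python sketch. The class, the function and the numeric origin values are hypothetical and exist only for this example; the sketch does not attempt to reproduce the full rule set of Sec 2.9.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class CandidatePath:
    """Hypothetical model of a candidate path, for illustration only."""
    name: str
    preference: int        # higher is better
    protocol_origin: int   # higher is better; only breaks preference ties
    valid: bool = True


def select_active_path(paths: Iterable[CandidatePath]) -> Optional[CandidatePath]:
    """Return the single active candidate path, or None if none is valid."""
    usable = [p for p in paths if p.valid]
    if not usable:
        return None
    # Tuple comparison: preference first, protocol origin only on a tie.
    return max(usable, key=lambda p: (p.preference, p.protocol_origin))


# Assumed origin values, chosen only so that the example has an ordering.
PCEP, BGP, CONFIG = 10, 20, 30

paths = [
    CandidatePath("cp-bgp", preference=200, protocol_origin=BGP),
    CandidatePath("cp-config", preference=100, protocol_origin=CONFIG),
    CandidatePath("cp-pcep", preference=200, protocol_origin=PCEP),
]
active = select_active_path(paths)
print(active.name)
```

In this example the configured path has the best protocol origin but loses on preference, and the tie between the two preference-200 paths is broken by origin. If the higher-preference paths become invalid, the same function returns the next best valid path, which is the preference failover described above.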

  *   (2.8) "A candidate path is valid if it is usable" -- you should define usable somewhere. This is the only use of the word "usable" in the document. If it just means that it has met its validity criterion as specified in Section 5, then this should just be stated.
It should have been: “A candidate path is usable when it is valid”. We will correct this.

  *   (2.7+2.8+2.9) For clarity, I would collapse these sections into something coherent - breaking them up makes the document less readable.
Each of them is an independent topic, and hence they have been kept separate for clarity.

  *   (2.8) It is only RECOMMENDED that a candidate path for an SR policy is given a different preference. There are then tiebreaker rules that are only MAY. Please explicitly specify this -- it's going to be a nightmare to manage a multi-vendor network and test this if there are optional ways to tie-break, especially as the number of preferences/priorities et al. means the number of combinations is large.
Will update the text based on your and co-author’s suggestions.

  *   (2.11) This specification of WECMP seems like it says the same thing that is in 2.2 -- can you consider making these coherently stated in the document?
Sec 2.11 talks about instantiation of the policy in the forwarding plane and discusses WECMP in that context, while Sec 2.2 brings in weight only to describe the association and the intention behind having multiple Segment-Lists within a candidate path.
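
As a rough illustration of weighted load-sharing across the Segment-Lists of the selected candidate path, here is a minimal Python sketch. The function names, the Segment-List names and the use of a per-flow hash are assumptions made for the example and are not taken from the draft.

```python
from fractions import Fraction
from typing import Dict


def _total_weight(weights: Dict[str, int]) -> int:
    """Validate the weights and return their sum."""
    if any(w < 0 for w in weights.values()):
        raise ValueError("weights must be non-negative")
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("at least one weight must be positive")
    return total


def traffic_shares(weights: Dict[str, int]) -> Dict[str, Fraction]:
    """Fraction of traffic each Segment-List carries: weight / sum of weights."""
    total = _total_weight(weights)
    return {name: Fraction(w, total) for name, w in weights.items()}


def pick_segment_list(weights: Dict[str, int], flow_hash: int) -> str:
    """Map a flow hash onto a Segment-List in proportion to its weight."""
    bucket = flow_hash % _total_weight(weights)
    for name, w in weights.items():
        if bucket < w:
            return name
        bucket -= w
    raise AssertionError("unreachable: bucket is always below the total weight")


# Hypothetical Segment-Lists of one candidate path, with weights 1 and 3.
weights = {"sl-1": 1, "sl-2": 3}
shares = traffic_shares(weights)
picks = [pick_segment_list(weights, h) for h in range(4)]
print(shares, picks)
```

Hashing per flow rather than per packet keeps the packets of one flow on the same Segment-List, which is a design choice of this sketch and not something the text above mandates.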

  *   (3) What is the reason to define an SR-DB here? It seems like there is little reference to this elsewhere, and it replicates information from other places - e.g., BGP RIB, MPLS label tables, and IGP-TE database. Do you intend that implementors create some new lookup of this information, or is this conceptual?
This section is conceptual and this is highlighted with the usage of multiple MAY keywords. It helps the reader to understand all the pieces of information that help in computation or validation of SR Policy. We will add text to clarify this.

     *   This concept is only defined here in the document -- isn't it just an implementation detail that does not need to be standardized?
This concept is necessary to make clear to implementers the multiple pieces of information that “MAY” be required for SR Policy computation or validation.

     *   Why should we replicate the IGP contents in this SRDB?
There is no replication requirement mentioned. That is left to the implementation as this is conceptual.

     *   You are recommending here that information such as topology is learnt from other domains - why is this a requirement for SR-TE policy? It certainly doesn't seem like a base requirement.
It is a MAY. Certainly not a “base” requirement.

                    To be clear, I would recommend removing this entire section from the document.
IMHO it is important conceptual information for implementers and hence relevant in this document.

  *   (5.2) I find this whole section under-specified again.
There was a specification for this in the previous version of this draft, which was moved to draft-filsfils-spring-sr-policy-considerations (please see Sec 3.1 there) at your request. Currently we have put a cross-reference to that draft.
The PCE draft referenced does not seem to tell me how to specify an optimisation objective, and there doesn't seem to be any explanation of where or how this would be specified at a head-end in configuration. I would recommend removing this from this document as it seems to be either:
The PCEP draft referenced describes the protocol extensions for SR, which are used for SR Policy. It does not describe the various optimization metrics, which are available in PCEP signaling (e.g. RFC5440 and RFC8233).

     *   Implementation specific -- if it's in configuration. The only difference might be if one were to want to specify a YANG model that specifically discusses how to specify such objectives.
The architecture draft is introducing the key concepts and structure. The Yang model itself is covered by a separate document. This specific aspect will be covered in a future version of draft-thomas-spring-sr-policy-yang.

     *   A few years ago, we discussed this whole problem space, and the conclusion seemed to be that having such objectives specified on the head-end would end up with significant code churn - having an opaque identifier that could be handed to the PCE seemed more logical.
I believe you refer to the notion of profile-ID in PCEP for specifying traffic steering and other features. That concept is still applicable to SR policy.

  *   (6) Please can we stop writing IETF documents like they are marketing a technology - "X is fundamental to Segment Routing" is not a useful statement. Please objectively state the technical benefit. Either way, this seems like this section can be removed.
We believe this is a technical point and the reasons are explained in the next sentence – “it provides scaling, network opacity and service independence”. If you think otherwise, please let us know.

  *   (6.2) Please make it a MUST that an alert of some form is generated when the BSID is not available. Otherwise it is entirely opaque to an operator that this wasn't accepted and that their policy is not in use.
Agreed. We will update this in the text.


  *   (6.2) In the case of BSID stickiness, if CandidatePath1 specifies label 42 (assumed to be in SRLB), and is more preferable than CandidatePath2 then AIUI, we'll install this policy and use label 42. Assume that there's some second CandidatePath2, which does not specify a BSID. If CandidatePath1 is subsequently withdrawn, even though CandidatePath2 doesn't specify a BSID, label 42 will be retained. If CP1 is now re-advertised, what happens - is this BSID considered "used" even though it's essentially now a dynamic allocation within the SRLB?
CP1 will be preferred and, since 42 was the BSID specified for it, it will use the same BSID.

  *   (6.4) Specifying binding SIDs can be assigned to any element doesn't seem to have any real benefit of being in this draft. If there are new BSID bindings that actually require standardisation - then I would highly recommend putting them in their own draft. There are many questions that come from this short paragraph. For example, how are those binding SIDs considered valid? Is there OAM interworking that is needed with other networks for liveness detection etc. We explicitly removed a lot of the BSID->RSVP including ERO advertisement in earlier revisions of SR related IGP drafts.
This is an important concept for the BSID as it relates to SR Policy and needs to be covered in this architecture draft. A Policy can be an instruction to steer over a specific interface. The relevant examples were explained in a previous version of this draft but were moved into draft-filsfils-spring-sr-policy-considerations based on your input. In fact, the details for the optical SR path use-case described in draft-anand-spring-poi-sr build on top of this construct.


  *   (7) The SR-TE Policy State is defined here to contain the reason that a policy or candidate path is not valid, however, the encodings in ietf-idr-te-lsp-distribution don't seem to say how this should be encoded. Is this the right reference? Is there extension to that work which is required?
Your observations are correct. This was introduced in draft-ietf-idr-te-lsp-distribution-09. Since this is the architecture document, things are likely to come here first before we work on necessary extensions in the protocols.

  *   (8.1 + 8.2) The document appears to have multiple different places where it discusses validity. Is there a reason that this can't be covered in 2.10 or 2.11? This would be a more natural place for the definition of "drop on invalid" too.
We have put a reference to the sections 2.10 and 5 which describe the validity of SR Policy in the text of section 8.1. The “drop on invalid” is a steering and forwarding plane notion and hence more appropriate under Section 8.

  *   (8.2) Drop on invalid appears underspecified to me here -- should it be maintained when there was an active BSID but all remote ways of advertising it went away, or only when all paths in their entirety are invalid?
This is when all the candidate paths of the SR Policy are invalid and hence the policy is invalid.
If the policy did not specify a BSID - i.e., it was dynamically allocated, should this be freed?
As mentioned in Sec 8.2, for drop-on-invalid SR Policies the BSID is required and is retained in the forwarding plane, but with an action to drop the traffic matching it.

What is the behaviour when we have IP2MPLS entries that rely on that policy -- they should never fall back to other mechanisms of routing the traffic via an alternate MPLS tunnel, or alternate IP forwarding entry?
Since it is marked as drop-on-invalid, the traffic matching the ip2mpls entries and being steered into this policy would also get dropped. The “normal” default behavior for an SR Policy would be that, on becoming invalid, it is taken out and we fall back to other alternates or to IP forwarding. So drop-upon-invalid is really a special behavior.
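
The difference between the default behavior and drop-upon-invalid described in this item can be summarised in a small Python sketch. The enum and function names are hypothetical and only restate the answers above; they are not an implementation of the draft.

```python
from enum import Enum
from typing import Iterable


class Action(Enum):
    STEER_ON_POLICY = "steer on policy"
    DROP = "drop"
    FALL_BACK = "fall back to alternates or IP forwarding"


def policy_is_valid(candidate_path_validity: Iterable[bool]) -> bool:
    """A policy is invalid only when all of its candidate paths are invalid."""
    return any(candidate_path_validity)


def action_for_steered_traffic(valid: bool, drop_upon_invalid: bool) -> Action:
    """Decide what happens to traffic that was steered into the policy."""
    if valid:
        return Action.STEER_ON_POLICY
    if drop_upon_invalid:
        # The BSID stays in forwarding, with an action to drop matching traffic.
        return Action.DROP
    # Default: the invalid policy is taken out and traffic uses other alternates.
    return Action.FALL_BACK


# Hypothetical policy whose two candidate paths have both become invalid.
valid = policy_is_valid([False, False])
print(action_for_steered_traffic(valid, drop_upon_invalid=True))
print(action_for_steered_traffic(valid, drop_upon_invalid=False))
```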

  *   (9.2) It's very unclear to me how one actually provisions this link protecting policy. You have <colour, endpoint> as the unique tuple -- what is the policy that one installs using the framework that is defined in this document to define an SR-TE policy for link protection? What's the match criteria for it?
The SR Policy is locally configured as a backup path for protecting the specific link. We have added text to mention this.

  *   (9.3) This has a similar problem as the point raised above about how one specifies that a policy is a backup for another policy.
Note that 9.3 is about specifying one CP as backup for another within the same SR Policy and so it is different from 9.2. The backup is provisioned and we have added some text to clarify this. This is not yet something covered in the signaling or the Yang model – but something that is necessary for achieving fast-reroute with path protection for SR Policy Candidate paths.
Thanks for your review of these issues. Happy to discuss them on the list further.

Kind regards,
r.
