Re: [spring] Comments on draft-spring-segment-routing-policy

"Ketan Talaulikar (ketant)" <ketant@cisco.com> Wed, 25 July 2018 15:58 UTC

From: "Ketan Talaulikar (ketant)" <ketant@cisco.com>
To: Rob Shakir <robjs=40google.com@dmarc.ietf.org>, "draft-ietf-spring-segment-routing-policy@tools.ietf.org" <draft-ietf-spring-segment-routing-policy@tools.ietf.org>, SPRING WG List <spring@ietf.org>
Date: Wed, 25 Jul 2018 15:58:46 +0000
Message-ID: <13e1dea1451d4a88937e7385f29a148b@XCH-ALN-008.cisco.com>
References: <CAHd-QWu9VzrvSxo_NMwsigo9v8i=_QzYA+hc3eJf65q215Ff=A@mail.gmail.com>
In-Reply-To: <CAHd-QWu9VzrvSxo_NMwsigo9v8i=_QzYA+hc3eJf65q215Ff=A@mail.gmail.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/spring/op7P9wBWE34BjpkZwq0MDFtrXnk>

Hi Rob,

Thanks a lot for your detailed review and comments. I will work on them and get back to you.

Thanks,
Ketan

From: spring <spring-bounces@ietf.org> On Behalf Of Rob Shakir
Sent: 24 July 2018 17:04
To: draft-ietf-spring-segment-routing-policy@tools.ietf.org; SPRING WG List <spring@ietf.org>
Subject: [spring] Comments on draft-spring-segment-routing-policy

(As an individual contributor)

Hi Authors,

I have a number of comments on draft-spring-segment-routing-policy; some are technical, some are editorial.  Frankly, I find this document quite difficult to read since it doesn't coherently make up a specification, and rather jumps about quite a lot, with a number of things being re-stated. I understand that this happens during development of a specification, but now that this is a WG doc, we should focus on making this document as clear as it can be.

Please find my specific comments below:

  *   (1. Introduction) In this section you use the wording "augmented with an ordered list of segments" - I would suggest rewording this. Whilst it might be the case for SRv6, for SR-MPLS, either we're doing pop+push, or simply pushing the new segment list. In this case, it's not really "augmenting", but adding a new stack of headers.
  *   (2. SR Policy) The introduction to this section isn't really clear to me.

     *   What is meant by a "specific intent"? I think that multiple "intents" might have the same SR policy - e.g., some latency optimisation and IGP shortest path are different "intents", but are likely to mean the same policy. Consider rewording this to a "common pathing behaviour" or similar.
     *   Why do we need to refer to what the SR architecture says about instructions? It seems to me here the core point is "An SR Policy may consist of any type of segment". Is there a technical reason to not just word this more simply?

  *   (2.1 Identification of an SR policy) Where there is a <headend, endpoint, colour> tuple, I believe that the intention here is to say that the headend IPv[46] address is domain unique -- is this expected? If so, should it be stated?
  *   (2.1) "An implementation MAY allow assignment of a symbolic name" - the MUST NOT that is specified here is interesting. I believe Harish asked a similar question in the working group. What is the intention with the MUST NOT? Operationally, should these names be unique per policy? Otherwise, it seems they are probably not that user friendly for identification.
  *   (2.2) There is some repetition in this section which makes it quite unclear -- I believe the key points are:

     *   A policy has a set of candidate paths.
     *   Those candidate paths can be dynamic or explicit.
     *   Candidate paths comprise a set of one or more SID lists.
     *   Explicit candidate paths have externally (to the router) specified SID lists. Dynamic ones are computed on the SID lists.
     *   Within each candidate path, traffic is load-balanced across the SID lists according to a weight.

Is this section saying something other than the above? If not, I recommend rewording it so that it does not have as many decision criteria to parse.
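
As an illustration of the key points listed above for 2.2, here is a minimal Python sketch of the structure. It is only a reading aid under stated assumptions: the class and field names (SrPolicy, CandidatePath, SidList, traffic_shares) and the label values are hypothetical, not taken from the draft, and the proportional split is simply one interpretation of "load-balanced according to a weight".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SidList:
    sids: List[int]   # ordered segment identifiers (illustrative label values)
    weight: int = 1   # relative share of traffic within its candidate path


@dataclass
class CandidatePath:
    kind: str         # "explicit" (externally specified) or "dynamic" (computed)
    preference: int
    sid_lists: List[SidList] = field(default_factory=list)


@dataclass
class SrPolicy:
    headend: str
    colour: int
    endpoint: str
    candidate_paths: List[CandidatePath] = field(default_factory=list)


def traffic_shares(path: CandidatePath) -> List[float]:
    """Fraction of traffic per SID list, proportional to its weight (assumed rule)."""
    total = sum(s.weight for s in path.sid_lists)
    if total == 0:
        raise ValueError("candidate path has no SID list with non-zero weight")
    return [s.weight / total for s in path.sid_lists]


# Hypothetical example: two SID lists weighted 3:1 in one explicit candidate path.
example_path = CandidatePath(
    kind="explicit",
    preference=100,
    sid_lists=[SidList([16001, 16002], weight=3), SidList([16003], weight=1)],
)
example_policy = SrPolicy("192.0.2.1", 10, "192.0.2.2", [example_path])
```

With these assumptions, traffic_shares(example_path) splits traffic 75% / 25% between the two SID lists.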

  *   (2.2) The whole sub-paragraph about replication is hugely underspecified. We say above that traffic is load-balanced across the SID lists - but then we say that traffic might be replicated. The IDR draft, and this draft have no further mention of "replication". I would highly recommend removing this section, since I think specifying this fully requires significantly more work, and is better done in another draft.
  *   (2.3) I would recommend that the "Local" should simply say "via configuration". "Yang model through NETCONF" isn't accurate since YANG and NETCONF are not coupled, and gRPC here is ambiguous.
  *   (2.3 + 2.12) There are a lot of priorities and preferences going on in this document.

     *   We have protocol origin, preference, and then priority. I would recommend restructuring to discuss clearly how these all relate to each other.
     *   From my parsing, your intention is to say:

        *   Within an individual SR policy the candidate path to select is specified using preference. If this is the case, then the load-sharing section in 2.2 should state that this is only when the candidate paths are of the same preference.

           *   See further comments on candidate paths and backups below.

        *   If multiple SR policies that have the same <colour, endpoint> are specified, then the protocol-origin is used to select between them. You should explicitly state whether protocol-origin is better as lower or higher.
        *   There is some head-end specific recomputation priority that is used only for dynamic candidate paths. If this is the case, then I'd recommend explicitly calling this "recomputation-priority".

  *   (2.3+2.12+9.3) There are many spanners thrown into the works throughout this document as to how things actually operate with different preferences. For example:

     *   If I have multiple candidate paths all with the same weight - I think you say we should load-share across the weights. It doesn't seem to be stated when I should failover to the lower preference paths.
     *   My assumption is that I do not compare preference across different protocol origins -- i.e., I pick the protocol-origin that is best, and then select within its candidate paths - such that it is only when ALL BGP (for example) paths have failed, that I go and look at ones configured statically. This would be my preference - but it should be clarified in the doc.
     *   The backup semantics that are in this doc are really preference failover AFAICS? There is mention that "another ... candidate path MAY be designated as a backup for a specific or all ... candidate paths". How does this work? Where is this signalled that a particular policy is a backup for another? How does this then interact with the preferences and protocol origin values?

  *   (2.8) "A candidate path is valid if it is usable" -- you should define usable somewhere. This is the only use of the word "usable" in the document. If it just means that it has met its validity criterion as specified in Section 5, then this should just be stated.
  *   (2.7+2.8+2.9) For clarity, I would collapse these sections into something coherent - breaking them up makes the document less readable.
  *   (2.8) It is only RECOMMENDED that a candidate path for an SR policy is given a different preference. There are then tiebreaker rules that are only MAY. Please explicitly specify this -- it's going to be a nightmare to manage a multi-vendor network and test this if there are optional ways to tie-break, especially as the number of preferences/priorities et al. means the number of combinations is large.
  *   (2.11) This specification of WECMP seems like it says the same thing that is in 2.2 -- can you consider making these coherently stated in the document?
  *   (3) What is the reason to define an SR-DB here? It seems like there is little reference to this elsewhere, and it replicates information from other places - e.g., BGP RIB, MPLS label tables, and IGP-TE database. Do you intend that implementors create some new lookup of this information, or is this conceptual?

     *   This concept is only defined here in the document -- isn't it just an implementation detail that does not need to be standardised?
     *   Why should we replicate the IGP contents in this SRDB?
     *   You are recommending here that information such as topology is learnt from other domains - why is this a requirement for SR-TE policy? It certainly doesn't seem like a base requirement.

To be clear, I would recommend removing this entire section from the document.
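
Returning to the (2.3+2.12+9.3) comment above, the selection order assumed there (pick the best protocol-origin first, then select by preference within it) can be sketched as follows. This is a hedged illustration of that reading only, not of what the draft specifies: the names are hypothetical, and "higher is better" for both protocol_origin and preference is an assumption, since the comment itself asks for the direction to be stated.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CandidatePath:
    name: str
    protocol_origin: int  # ASSUMPTION: higher value is better
    preference: int       # ASSUMPTION: higher value is better
    valid: bool = True


def select_active(paths: List[CandidatePath]) -> Optional[CandidatePath]:
    """Pick the best origin among valid paths, then the best preference within it."""
    valid = [p for p in paths if p.valid]
    if not valid:
        return None  # nothing usable; behaviour here is outside this sketch
    best_origin = max(p.protocol_origin for p in valid)
    same_origin = [p for p in valid if p.protocol_origin == best_origin]
    # Preference is compared only within the chosen origin, never across origins.
    return max(same_origin, key=lambda p: p.preference)


# Hypothetical paths: two learnt via BGP (origin 20), one configured statically (origin 10).
bgp_a = CandidatePath("bgp-a", protocol_origin=20, preference=100)
bgp_b = CandidatePath("bgp-b", protocol_origin=20, preference=50)
static = CandidatePath("static", protocol_origin=10, preference=200)
paths = [bgp_a, bgp_b, static]
```

Under these assumptions the static path is chosen only once both BGP paths are invalid, even though its preference value is numerically the highest.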

  *   (5.2) I find this whole section under-specified again. The PCE draft referenced does not seem to tell me how to specify an optimisation objective, and there doesn't seem to be any explanation of where or how this would be specified at a head-end in configuration. I would recommend removing this from this document as it seems to be either:

     *   Implementation specific -- if it's in configuration. The only difference might be if one were to want to specify a YANG model that specifically discusses how to specify such objectives.
     *   A few years ago, we discussed this whole problem space, and the conclusion seemed to be that having such objectives specified on the head-end would end up with significant code churn - having an opaque identifier that could be handed to the PCE seemed more logical.

  *   (6) Please can we stop writing IETF documents like they are marketing a technology - "X is fundamental to Segment Routing" is not a useful statement. Please objectively state the technical benefit. Either way, it seems like this section can be removed.
  *   (6.2) Please make it a MUST that an alert of some form is generated when the BSID is not available. Otherwise it is entirely opaque to an operator that this wasn't accepted and that their policy is not in use.
  *   (6.2) In the case of BSID stickiness, if CandidatePath1 specifies label 42 (assumed to be in the SRLB) and is preferred over CandidatePath2, then AIUI we'll install this policy and use label 42. Assume that CandidatePath2 does not specify a BSID. If CandidatePath1 is subsequently withdrawn, label 42 will be retained even though CandidatePath2 doesn't specify a BSID. If CP1 is now re-advertised, what happens? Is this BSID considered "used" even though it's essentially now a dynamic allocation within the SRLB?
  *   (6.4) Specifying that binding SIDs can be assigned to any element doesn't seem to have any real benefit in this draft. If there are new BSID bindings that actually require standardisation, then I would highly recommend putting them in their own draft. There are many questions that come from this short paragraph. For example, how are those binding SIDs considered valid? Is there OAM interworking that is needed with other networks for liveness detection, etc.? We explicitly removed a lot of the BSID->RSVP including ERO advertisement in earlier revisions of SR related IGP drafts.
  *   (7) The SR-TE Policy State is defined here to contain the reason that a policy or candidate path is not valid; however, the encodings in ietf-idr-te-lsp-distribution don't seem to say how this should be encoded. Is this the right reference? Is there an extension to that work that is required?
  *   (8.1 + 8.2) The document appears to have multiple different places where it discusses validity. Is there a reason that this can't be covered in 2.10 or 2.11? This would be a more natural place for the definition of "drop on invalid" too.
  *   (8.2) Drop on invalid appears underspecified to me here -- should it be maintained when there was an active BSID but all remote ways of advertising it went away, or only when all paths in their entirety are invalid? If the policy did not specify a BSID - i.e., it was dynamically allocated, should this be freed? What is the behaviour when we have IP2MPLS entries that rely on that policy -- they should never fall back to other mechanisms of routing the traffic via an alternate MPLS tunnel, or alternate IP forwarding entry?
  *   (9.2) It's very unclear to me how one actually provisions this link protecting policy. You have <colour, endpoint> as the unique tuple -- what is the policy that one installs using the framework that is defined in this document to define an SR-TE policy for link protection? What are the match criteria for it?
  *   (9.3) This has a similar problem as the point raised above about how one specifies that a policy is a backup for another policy.

Thanks for your review of these issues. Happy to discuss them on the list further.

Kind regards,
r.