[spring] Benjamin Kaduk's Discuss on draft-ietf-spring-segment-routing-policy-17: (with DISCUSS and COMMENT)

Benjamin Kaduk via Datatracker <noreply@ietf.org> Thu, 17 February 2022 06:16 UTC

Return-Path: <noreply@ietf.org>
X-Original-To: spring@ietf.org
Delivered-To: spring@ietfa.amsl.com
Received: from ietfa.amsl.com (localhost [IPv6:::1]) by ietfa.amsl.com (Postfix) with ESMTP id D9FBE3A121C; Wed, 16 Feb 2022 22:16:24 -0800 (PST)
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
From: Benjamin Kaduk via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-spring-segment-routing-policy@ietf.org, spring-chairs@ietf.org, spring@ietf.org, james.n.guichard@futurewei.com, james.n.guichard@futurewei.com
X-Test-IDTracker: no
X-IETF-IDTracker: 7.45.0
Auto-Submitted: auto-generated
Precedence: bulk
Reply-To: Benjamin Kaduk <kaduk@mit.edu>
Message-ID: <164507858486.11948.7818447548279924153@ietfa.amsl.com>
Date: Wed, 16 Feb 2022 22:16:24 -0800
Archived-At: <https://mailarchive.ietf.org/arch/msg/spring/zEiRCrEFW2G7ujgatd42RtKYXiE>
Subject: [spring] Benjamin Kaduk's Discuss on draft-ietf-spring-segment-routing-policy-17: (with DISCUSS and COMMENT)
X-BeenThere: spring@ietf.org
X-Mailman-Version: 2.1.29
List-Id: "Source Packet Routing in NetworkinG \(SPRING\)" <spring.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/spring>, <mailto:spring-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/spring/>
List-Post: <mailto:spring@ietf.org>
List-Help: <mailto:spring-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/spring>, <mailto:spring-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 17 Feb 2022 06:16:25 -0000

Benjamin Kaduk has entered the following ballot position for
draft-ietf-spring-segment-routing-policy-17: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/blog/handling-iesg-ballot-positions/
for more information about how to handle DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-spring-segment-routing-policy/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

(1) I may just be misunderstanding things, but I'd like to pull on a thread
in §8.4 a bit more.  We say that the headend H learns a BGP route that has a
VPN label V, but then the following procedures seem to say that we install a
route on the appropriate SR Policy P and that when we receive a packet that
matches the route in question, push a label stack including the VPN label,
and send the resulting packet out.  Nowhere do we say to check the VPN
status of the incoming packet, so this seems like it would open a hole in
the VPN by allowing "arbitrary" incoming traffic (not marked as specific to
V) to enter that VPN.  Is the label V filling some other role than
identifying a specific VPN of many VPNs that could run along the route R/r?
(This is the only instance of the phrase "VPN label" in the document, and no
reference is given, so I'm relying heavily on instinct to ascertain the
intent here.)

(2) The security considerations says that this document does not define any
new protocol extensions and (accordingly) does not introduce any further
security considerations.  The first part of this seems false, not least
since we define the meaning of the "CO" bits in the Color Extended
Community.  I'm pretty sure that makes the second part also false, and we
need to discuss the security considerations relating to imposing SR Policies
based only on color and not next-hop.  Alvaro has also noted additional
aspects where security considerations are missing.

(3) The Discriminator as defined in §2.5 does not seem wide enough to be
able to provide the needed properties.  Some later clarification in §2.6
implies that the definition in §2.5 is incomplete and the width is actually
appropriate, but in either case §2.5 seems inadequate in its current form.
(Details in the COMMENT.)

(4) Section 2.11 contains the statement, "A valid SR Policy is instantiated
in the forwarding plane."

Is this a statement of fact (i.e., a consequence of the definition of
"valid") or a mandate for something (e.g., the headend) to take action to
make it so?  Given that the point of SR is to be stateless on nodes other
than the headend, I suspect the former, but if we are relying on the headend
(or some other entity) to take action to ensure this is the case, that needs
to be a clearly stated normative requirement.

(5) Section 8.4 uses the phrase "any AFI/SAFI of LISP [RFC6830]."
There's nothing in the IANA registry for SAFI
(https://www.iana.org/assignments/safi-namespace/safi-namespace.xhtml) about
LISP, and RFC 6830 doesn't talk about SAFI.  What is this referring to?


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

There's a lot of this document that feels like just some informational
discussion of "here are some things that many people do", "here are some
possible things you can do with SR", etc..  There are also a small handful
of places in the document that look to actually be specifying parts of
protocol behavior (I suspect that John has already identified them in his
enumeration), and the overall impression ends up being a bit jumbled,
like there are a bunch of topics stuck together without an overarching theme.
I think the overall content would be more valuable if divided into a tight
"protocol specification" portion that could stay at proposed standard, plus
an informational "architecture details" document that contains the
in-depth exposition that didn't make it into 8402.

This draft would benefit greatly from a terminology section.  I note in the
section-by-section comments several places where a term is first used
without sufficient background/definition, leaving the matter at hand
underspecified for the reader.

Section 2

   An SR Policy is a framework that enables the instantiation of an
   ordered list of segments on a node for implementing a source routing

Really, an SR policy is a *framework*?  I thought an SR policy was a
specific instantiation of a list of segments, or at least that's what I'm
getting from RFC 8402.  Perhaps we should say that the general concept of SR
Policy provides a framework?

Section 2.1

   An SR Policy MUST be identified through the tuple <headend, color,
   endpoint>.  In the context of a specific headend, an SR policy MUST
   be identified by the <color, endpoint> tuple.

These two MUSTs appear to be in (nominal) conflict.  Maybe start the first
one with "absent further context" or "absent the context of a known headend
node"?

   The headend is the node where the policy is instantiated/implemented.
   The headend is specified as an IPv4 or IPv6 address and is expected
   to be unique in the domain.

This is the first instance of the word "domain" in this document.  I suggest
using the introduction to introduce what is meant by the word, even if just
by reference to RFC 8402.

   An implementation MAY allow the assignment of a symbolic name
   comprising printable ASCII [RFC0020] characters (i.e.  0x20 to 0x7E)
   to an SR Policy to serve as a user-friendly attribute for debugging
   and troubleshooting purposes.  [...]

I agree with the other ADs that limiting to US-ASCII is not actually
user-friendly for many users, and that the likelihood of some
implementations not properly enforcing such a limitation to be high.
(Likewise for the other places where symbolic names are admitted.)

Section 2.2

   A dynamic candidate path expresses an optimization objective and a
   set of constraints.  [...]

Down in §5.2 when we discuss validation procedures for dynamic candidate
paths, we say that the optimization problem is solved "for either the
SR-MPLS or the SRv6 data-plane as specified".  Does the data plane need to
be specified as part of the dynamic candidate path itself?

Section 2.3

   in Section 2.9.  The table below specifies the RECOMMENDED default
   values of Protocol-Origin:

I feel like it would be useful to provide some justification for why the
recommended default behavior prefers BGP SR configuration over PCEP, even if
that justification is just "we need to have a clear ordering and this one is
arbitrary".

Section 2.4

   o  Node Address : represented as a 128-bit value.  IPv4 addresses
      MUST be encoded in the lowest 32 bits, and the high-order bits
      MUST be set to zero.

   Its application in the candidate path selection is described in
   Section 2.9.

The tie-breaker procedure for path selection described in §2.9 seems to
always prefer IPv4 originators over IPv6 ones (by virtue of preferring the
smaller value).  I guess if we wanted to change that to prefer IPv6 we have
the option of fc00::/7 (unique-local) or fe80::/10 (link-scoped unicast)
from BCP 153, but it's a bit hard to justify either of those as appropriate
on technical grounds, and since this is just a tie-breaker and the
Preference is explicitly preferred, it seems like this is probably "good
enough" as-is.

Section 2.5

   The Discriminator is a 32-bit value associated with a candidate path
   that uniquely identifies it within the context of an SR Policy from a
   specific Protocol-Origin as specified below:

What are the constraints that underlie the 32-bit requirement here?
It looks like some of the scenarios are going to involve uncoordinated
(random) assignment of these discriminator values (e.g., with the BGP
distribution mechanism, when coming from different BGP peers), and the
birthday-bound collision probability is not negligible for this few bits.
That, in turn, calls into question the "uniquely identifies" property being
claimed.  Or is there some other property that means that only
discriminators from a single issuer will ever need to be compared with each
other (making the allocation "coordinated"), such as being additionally
associated with the originator?
If my initial analysis was incorrect and these are indeed allocated in a
"coordinated" fashion, would it be typical/expected for the allocation to
occur by incrementing a local counter on the originator?  In some situations
such allocation by counter can have security considerations, which
draft-gont-numeric-ids-sec-considerations attempts to cover.

Section 2.8

   A candidate path is usable when it is valid.  A common path validity
   criterion is the validity of any of its constituent Segment-Lists.
   The validation rules are specified in Section 5.

This document claims to target Proposed Standard status; are we really
content to say only that this is "a common" criterion?  Even when we also go
on to flat-out state "the validation rules are specified [below]"?

Section 2.9

   The candidate path selection process operates primarily on the
   candidate path Preference.  A candidate path is selected when it is
   valid and it has the highest preference value among all the candidate
   paths of the SR Policy.

Should this be "among all the valid candidate paths"?  A path that's invalid
is still invalid, even if it has the highest preference value.

   2.  If specified by configuration, prefer the existing installed
       path.

Does "if specified by configuration" refer to the act of applying this rule
at all, or that the existing installed path was one specified by
configuration?

Section 2.11

   The fraction of the flows associated with a given Segment-List is w/
   Sw, where w is the weight of the Segment-List and Sw is the sum of
   the weights of the Segment-Lists of the selected path of the SR
   Policy.

Thank you for stating this clearly!

Section 3

   o  TE Link Attributes (such as TE metric, Shared Risk Link Groups,
      attribute-flag, extended admin group) [RFC5305] [RFC3630].

Is RFC 5329 applicable here as well?

Section 4

   Type E: IPv4 Prefix with Local Interface ID:
         This type allows identification of Adjacency SID or BGP Peer
         Adjacency SID (as defined in [RFC8402]) SR-MPLS label for
         point-to-point links including IP unnumbered links.  The
         headend is required to resolve the specified IPv4 Prefix
         Address to the Node originating it and then use the Local
         Interface ID to identify the point-to-point link whose
         adjacency is being referred to.  The Local Interface ID link
         descriptor follows semantics as specified in [RFC7752].  This

The phrase "local interface ID" does not appear in RFC 7752 (and even "local
interface" appears just once"; please use terminology actually present in
the referred-to document to clarify what is being referenced.

Section 4.1

   When steering unlabeled IPv6 BGP destination traffic using an SR
   policy composed of Segment-List(s) based on IPv4 SIDs, the Explicit
   Null Label Policy is processed as specified in
   [I-D.ietf-idr-segment-routing-te-policy]) Section 2.4.4.  When an

It looks like this is §2.4.5, not 2.4.4, in the referenced document.

Section 5.1

   The computation/logic that leads to the choice of the Segment-List is
   external to the SR Policy headend.  The SR Policy headend does not
   compute the Segment-List.  The SR Policy headend only confirms its
   validity.

Does the headend actually have to confirm validity?  Is it okay to just
trust the controller and blindly use what is provided?

Section 6.2

   When the active candidate path has a specified BSID, the SR Policy
   uses that BSID if this value (label in MPLS, IPv6 address in SRv6) is
   available (i.e., not associated with any other usage: e.g. to another
   MPLS client, to another SRv6 client, to another SID, to another SR
   Policy, outside the range of SRv6 Locators).

I don't think I understand what is meant by "client" here (for "another
client").  This sentence is the only place where the word "client" appears
in this document...

   Optionally, instead of only checking that the BSID of the active path
   is available, a headend MAY check that it is available within a given
   SID range i.e., Segment Routing Local Block (SRLB) as specified in
   [RFC8402].

Is the only allowed range to check the SRLB?  If not, I think we need to
s/i.e./e.g./.

   When the specified BSID is not available (optionally is not in the
   SRLB), an alert message MUST be generated.

This is the first time (of only two) the word "alert" appears in this
document, and there is no prior expalanation of what entity might be
receiving alerts generated by a headend.  Please clarify.

   Assuming that at time t the BSID of the SR Policy is B1, if at time
   t+dt a different candidate path becomes active and this new active
   path does not have a specified BSID or its BSID is specified but is
   not available (e.g. it is in use by something else), then the SR
   Policy MAY keep the previous BSID B1.

Is there a strict bound on or other guidance for what values of dt are
allowable for this purpose?  Is the intent that there be an atomic
transition from BSID=B1;active-path=P1 to BSID=B1;active-path=P2?

   The association of an SR Policy with a BSID thus MAY change over the
   life of the SR Policy (e.g., upon active path change).  Hence, the
   BSID SHOULD NOT be used as an identification of an SR Policy.

Is there any guidance available on how long to wait with a given BSID value
unused before binding it to a new SR Policy?

Section 6.2.3

   An implementation MAY support the configuration of the Specified-
   BSID-only restrictive behavior on the headend for all SR Policies or
   individual SR Policies.  Further, this restrictive behavior MAY also
   be signaled on a per SR Policy basis to the headend.

Elsewhere in the document we discuss specific potential signaling
mechanisms/protocols, but here we say nothing.  Is that vagueness
intentional?

Section 6.3

   A valid SR Policy installs a BSID-keyed entry in the forwarding plane
   with the action of steering the packets matching this entry to the
   selected path of the SR Policy.

I don't think this is stated properly.  An SR Policy is the list of
segments; it isn't the entity that's installing entries in the forwarding
plane.  Some other entity is installing an entry in the forwarding plane to
realize the SR Policy in question, and we should make our writing reflect
that.

Section 6.4

   An implementation MAY choose to associate a Binding SID with any type
   of interface (e.g. a layer 3 termination of an Optical Circuit) or a
   tunnel (e.g.  IP tunnel, GRE tunnel, IP/UDP tunnel, MPLS RSVP-TE
   tunnel, etc).  This enables the use of other non-SR enabled

Should we have some discussion that contrasts this scenario against the
End.X behavior from RFC 8986 (for the "interface" case)?

Section 7

   The SR Policy State is maintained on the headend to represent the
   state of the policy and its candidate paths.  [...]

I confess I don't really understand why we need to have the current,
minimal, description of SR Policy State in this document.  What would be
lost if we deferred its discussion entirely until there is a more
comprehensive discussion available?

   The SR Policy state can be reported by the headend node via BGP-LS
   [I-D.ietf-idr-te-lsp-distribution] or PCEP [RFC8231] and
   [I-D.ietf-pce-binding-label-sid].

The functionality of draft-ietf-pce-binding-label-sid seems much more
limited than that of draft-ietf-idr-te-lsp-distribution; in particular, the
former does not seem to actually report SR Policy state to the headed at
all; rather, it only concerns itself with BSID association to path, with no
information about "active", "not preferred", etc.

Section 8.3

   If the SR Policy P is invalid, the BSID B is not in the forwarding
   plane and hence the packet K is dropped by H.

We literally just in the previous section talked about a scenario where the
BSDI is kept in the forwarding plane (but with the action to drop, so the
overall outcome is not changed from what this text describes).  Nonetheless,
it's inaccurate to state that "the BSID B is not in the forwarding plane"
here.

Section 8.4.1

   When a BGP route has multiple Color Extended communities each with a
   valid SR Policy, the BGP process installs the route on the SR Policy
   giving preference to the color with the highest numerical value.

Do we want to say anything about this being an arbitrary tiebreaker (rather
than an intentional preference), or is that thought to be implicitly clear?

Section 8.5

   In this section, independent of the a-priori existence of any
   explicit candidate path of the SR policy (C, N), it is to be noted
   that the BGP process at headend node H triggers the instantiation of
   a dynamic candidate path for the SR policy (C, N) as soon as:

I strongly suggest providing a more explicit framework of what the
assumptions and preconditions are for the mechanism described in this
section.  My intuition says that it's a fairly optional thing that would
need to be specifically configured, but trying to wrap that sentiment into
the long bullet point involving "a local policy" seems like a very confusing
way to express the desired behavior.

Section 8.6

   o  is configured to instantiate an array of paths to N where the
      entry 0 is the IGP path to N, color C1 is the first entry and
      Color C2 is the second entry.  The index into the array is called
      a Forwarding Class (FC).  The index can have values 0 to 7.

Why are the only allowed values 0 to 7?  Where does this restriction arise
from?  It is because of some protocol element?

   If the local configuration does not specify any explicit forwarding
   information for an entry of the array, then this entry is filled with
   the same information as entry 0 (i.e. the IGP shortest path).

   If the SR Policy mapped to an entry of the array becomes invalid,
   then this entry is filled with the same information as entry 0.  When
   all the array entries have the same information as entry0, the
   forwarding entry for N is updated to bypass the array and point
   directly to its outgoing interface and next-hop.

I can't tell how much of this is supposed to be protocol specification and
how much an illustrative example.  Is A(0) always the IGP shortest path?
Are these protocol requirements to fall back to the IGP shortest path when
an entry is otherwise unpopulated or the associated SR Policy becomes
invalid?

Section 8.8.2

   The steering preference is first based on the highest color value and
   then CO-dependent for the color.  [...]

This seems to contradict what I assumed earlier about the "highest color"
rule being a tiebreaker, e.g., the word "preference" is used here.  Is it
actually intended to be a deliberately configured priority/preference
scheme?  If so, that would seem to require some wide-ranging reworking
throughout the document.

Section 9.3

   the most appropriate alternative for the active candidate path.  A
   fast re-route mechanism MAY then be used to trigger sub 50msec
   switchover from the active to the backup candidate path in the
   forwarding plane.  Mechanisms like Bidirectional Forwarding Detection
   (BFD) MAY be used for fast detection of such failures.

Why is the specific 50msec value important here?  Is there some other
requirement that imposes it?

Section 10

I think we also want to mention the security considerations of several more
documents, including (but not limited to)
draft-ietf-idr-segment-routing-te-policy and RFCs 8660, 8754, and 8986.

Section 15.2

I agree with John that draft-ietf-idr-segment-routing-te-policy must be
classified as a normative reference.

It also seems that RFC 7752 should be classified as normative, as we
incorporate its definition for the semantics of several of the segment type
descriptions.


NITS

Section 1

   Segment Routing Policy (SR Policy) [RFC8402] is an ordered list of
   segments (i.e. instructions) that represent a source-routed policy.

/Segment/A Segment/

   The headend node is said to steer a flow into a SR Policy.  The
   packets steered into an SR Policy carry an ordered list of segments
   associated with that SR Policy.  [...]

In a certain sense this can be read as saying that the packets that "carry
an ordered list of segments" are the ones prior to being steered into an SR
policy, which would make this statement not true.  Perhaps we want to say
"after being steered into an SR Policy, packets carry an ordered list ..."?
(I also went back and forth with myself about whether "packets ... carry"
implies in the payload or not.  I settled on "not" but make this note just
in case I am missing an aspect of that question.)

Section 2

   An SR Policy is a framework that enables the instantiation of an
   ordered list of segments on a node for implementing a source routing

It's easy to read this as saying that all of the segments in the list
instantiated on the single node in question, which I assume is not the
intent.  Probably the easiest way to aid readability here is to split the
sentence up into multiple smaller sentences that are easier to parse.

Section 2.2

   A dynamic candidate path expresses an optimization objective and a
   set of constraints.  The headend (potentially with the help of a PCE)
   computes the solution Segment-List (or set of Segment-Lists) that
   solves the optimization problem.

I'd suggest computes the solution/computes a solution/ for genericity.
A stateful PCE might end up computing a path that is not the optimal one for
this specific optimization problem, due to a desire to cooperate with other
paths in the network, and the "Min-Metric with margin and maximum number of
SIDs" objective in draft-filsfils-spring-sr-policy-considerations doesn't
even have a guaranteed unique best solution.

Section 2.5

   When provisioning is via configuration, this is an implementation's
   configuration model-specific unique identifier for a candidate path.
   The default value is 0.

I'm having a lot of trouble parsing this.  Did we perhaps mean to hyphenate
as "configuration-model-specific"?

Section 2.13

   The SR Policy POL1 is identified by the tuple <headend, color,
   endpoint>.  It has two candidate paths CP1 and CP2.  Each is
   identified by a tuple <protocol-origin, originator, discriminator>.

I suggest (for the last sentence) "identified within the scope of POL1"

   forwarding instantiation of SR policy POL1.  Traffic steered on POL1
   is flow-based hashed on Segment-List <SID11...SID1i> with a ratio
   W1/(W1+W2).

If I read "ratio" I would instinctively think of the ratio of (traffic on
segment list 1)/(traffic on segment list 2), as opposed to the proportion of
all traffic, that would be measured as the indicated W1/(W1+W2).

Section 3

   The attached domain topology may be learned via IGP, BGP-LS or
   NETCONF.

   A non-attached (remote) domain topology may be learned via BGP-LS or
   NETCONF.

I think these are both probably not exhaustive lists, so "e.g." or similar
may be appropriate.

Section 4

   Type C: IPv4 Prefix with optional SR Algorithm:
         The headend is required to resolve the specified IPv4 Prefix
         Address to the SR-MPLS label corresponding to a Prefix SID
         segment (as defined in [RFC8402]).  The SR algorithm (refer to
         Section 3.1.1 of [RFC8402]) to be used MAY also be provided.

   Type D: IPv6 Global Prefix with optional SR Algorithm for SR-MPLS:
         In this case, the headend is required to resolve the specified
         IPv6 Global Prefix Address to the SR-MPLS label corresponding
         to its Prefix SID segment (as defined in [RFC8402]).  The SR
         Algorithm (refer to Section 3.1.1 of [RFC8402]) to be used MAY

These are effectively just the IPv4 and IPv6 incarnations of the same
underlying procedure, right?  Can't we minimize the diff between the
paragraphs further?

Section 5.1

   Additionally, a Segment-List MAY be declared invalid when:

We probably want another word here ("both"?), to specify how the two
conditions are combined.

Section 5.2

   When the local computation is not possible (e.g., a policy's tail-end
   is outside the topology known to the headend) or not desired, the
   headend MAY send path computation request to a PCE supporting PCEP
   extension specified in [RFC8664].

missing article ("the PCEP extension").  I forget if it should be
"extensions" plural.

Section 8.7

   Finally, headend H MAY be configured with a local routing policy
   which overrides any BGP/IGP path and steer a specified packet on an

singular/plural mismatch -- s/steer/steers/