Re: [spring] Benjamin Kaduk's Discuss on draft-ietf-spring-segment-routing-mpls-19: (with DISCUSS and COMMENT)

Ahmed Bashandy <abashandy.ietf@gmail.com> Wed, 17 April 2019 17:33 UTC

To: Benjamin Kaduk <kaduk@mit.edu>, The IESG <iesg@ietf.org>
Cc: draft-ietf-spring-segment-routing-mpls@ietf.org, Shraddha Hegde <shraddha@juniper.net>, spring-chairs@ietf.org, spring@ietf.org
References: <155493383368.22657.17493399361449567071.idtracker@ietfa.amsl.com>
From: Ahmed Bashandy <abashandy.ietf@gmail.com>
Message-ID: <c87710a2-9545-15d0-54a8-ddae87395324@gmail.com>
Date: Wed, 17 Apr 2019 10:33:18 -0700
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.14; rv:52.0) Gecko/20100101 Thunderbird/52.7.0
MIME-Version: 1.0
In-Reply-To: <155493383368.22657.17493399361449567071.idtracker@ietfa.amsl.com>
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Transfer-Encoding: 8bit
Content-Language: en-US
Archived-At: <https://mailarchive.ietf.org/arch/msg/spring/ltUdvsEpjzkOXU6gjgVqkxynOi8>
Subject: Re: [spring] Benjamin Kaduk's Discuss on draft-ietf-spring-segment-routing-mpls-19: (with DISCUSS and COMMENT)
Precedence: list

Thanks a lot for the valuable comments

I just published version 20

See inline "#Ahmed". I left some of the nits to the editor when the RFC 
is about to be published

thanks

Ahmed
On 4/10/19 3:03 PM, Benjamin Kaduk via Datatracker wrote:
> Benjamin Kaduk has entered the following ballot position for
> draft-ietf-spring-segment-routing-mpls-19: Discuss
>
> When responding, please keep the subject line intact and reply to all
> email addresses included in the To and CC lines. (Feel free to cut this
> introductory paragraph, however.)
>
>
> Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
> for more information about IESG DISCUSS and COMMENT positions.
>
>
> The document, along with other ballot positions, can be found here:
> https://datatracker.ietf.org/doc/draft-ietf-spring-segment-routing-mpls/
>
>
>
> ----------------------------------------------------------------------
> DISCUSS:
> ----------------------------------------------------------------------
>
> (pro forma) Six authors is more than five, which per RFC 7322 may require
> discussion.
>
> I have a few questions about whether we need to have more stringent or
> more specific requirements listed.
>
> In Section 2:
>
>     An implementation SHOULD check that an IGP node-SID is not associated
>     with a prefix that is owned by more than one router within the same
>     routing domain. If so, it SHOULD NOT use this Node-SID, MAY use
>     another one if available, and SHOULD log an error.
#Ahmed: the paragraph is removed as RFC8402 as Alvaro suggested
>
> While it's not entirely clear to me that we need to mandate checking
> (the "SHOULD check"), I have a hard time understanding why we would
> allow a known-bad SID to be used ("SHOULD NOT use this Node-SID").
> Shouldn't that be a "MUST NOT", since using it could break the SR
> abstraction?
>
> In Section 2.5:
>
>     5. The remaining FECs with the default algorithm (see the
>        specification of prefix-SID algorithm [RFC8402]) are installed in
>        the FIB natively, such as pure IP entries in case of Prefix FEC,
>        without any incoming labels corresponding to their SIDs. The
>        remaining FECs with a non-zero algorithm are not installed in the
>        FIB.
>
> I didn't really find where in RFC 8402 we assigned numerical values to
> the prefix-SID algorithms, such that "non-zero algorithm" was
> well-defined.  Should I be looking somewhere else for this?
#Ahmed: I agree. I changed the wording to use "shortest path first" and 
referred to RFC8402
>
> In Section 2.5.1: I left several notes in the COMMENT section, but I
> think I can summarize the point to "it seems like we are defining a
> mapping from attributes of a given FEC/description to a byte string and
> applying an ordering to that byte string.  But we don't fully specify
> how all the bits are encoded in that byte string, and it looks like we
> can end up with byte strings of a different length, so the comparison
> rule is not necessarily clear in that case."  This seems fairly related
> to Alvaro's point (2).
>
> In Appendix A.1
>
>         | Local IGP SID allocated dynamically by R2                 |
>         |                     for its "north" adjacency to R3: 9001 |
>         |                     for its "north" adjacency to R3: 9003 |
>         |                     for its "south" adjacency to R3: 9002 |
>         |                     for its "south" adjacency to R3: 9003 |
>
> 9003 is duplicated for different adjacencies?  Isn't that a strongly
> disrecommended scenario?
#Ahmed: Corrected (thanks for catching the mistake)
>
>
> ----------------------------------------------------------------------
> COMMENT:
> ----------------------------------------------------------------------
>
> It seems that we're introducing something of a new concept in this
> document of "routing instance" as something with a numerical identifier.
> (That is, this does not appear in RFC 8402 or RFC 3031, in terms of
> what references I might expect to be in scope.)  Am I just missing some
> other reference where this is introduced?  If not, maybe it is worth
> mentioning in a terminology section.
#Ahmed: the numerical values are used for the purpose of illustrating 
the tie breaking rules. What matters is the ordering of the tie-breaking 
rules
>
> [I think some of these section-by-section notes were spotted already;
> I didn't get a chance to deduplicate.]
>
> Section 2
>
>     In order to have a node segment to reach the node, a network operator
>     SHOULD configure at least one node segment per routing instance,
>     topology, algorithm. [...]
>
> nit: maybe "per tuple of [...]"?
>
> Section 2.2
>
>     o  The label value MUST be unique within the router on which the MCC
>        is running. i.e. the label MUST only be used to represent the SID
>        and MUST NOT be used to represent more than one SID or for any
>        other forwarding purpose on the router.
>
> Maybe I'm misreading the intent, but "MUST be unique" seems like it's a
> requirement from core MPLS and need not be restated.
#Ahmed: This statement emphasizes the need for the tie-breaking rules in 
Section 2.5
>
> Section 2.3
>
>                                                                 The rules
>     applicable to the SRGB are also applicable to the SRLB, except rule
>     that says that the SRGB MUST only be used to instantiate global SIDs
>     into the MPLS forwarding plane. [...]
>
> nit: "except the rule"
#Ahmed: Done
>
> Section 2.4
>
> I'd consider writing the algorithm in real code (python?) rather than
> abstract pseudocode.  In some cases (though probably not here?)
> pseudocode makes it easy to miss edge cases that need to be specified in
> order for things to be interoperably implementable.
>
> Section 2.5
>
>     MPLS Architecture [RFC3031] defines Forwarding Equivalence Class
>     (FEC) term as the set of packets with similar and / or identical
>     characteristics which are forwarded the same way and are bound to the
>     same MPLS incoming (local) label. In Segment-Routing MPLS, local
>     label serves as the SID for given FEC.
>
> nit: there's some missing (in)definite articles here; "The MPLS
> Architecture", "the local label", "a given FEC".  (And it probably reads
> better as "defines the term [FEC]" than putting "term" after the name of
> the term.
#Ahmed: Fixed
>
>     o  (Prefix, Routing Instance, Topology, Algorithm [RFC8402]), where a
>        topology identifies a set of links with metrics. For the purpose
>        of incoming label collision resolution, the same Topology
>        numerical value SHOULD be used on all routers to identify the same
>        set of links with metrics. [...]
>
> Is the IGP going to help me satisfy this SHOULD or is it more of a
> pie-in-the-sky sort of thing?
#Ahmed: Well, it is just like all "SHOULDs" in all RFCs and drafts
>
> Section 2.5.1
>
>     This document defines the default tie breaking rules that SHOULD be
>     implemented. An implementation MAY choose to support different tie-
>     breaking rules and MAY use one of the these instead of the default
>     tie-breaking rules. All routers in a routing domain SHOULD use the
>     same tie-breaking rules to maximize forwarding consistency.
>
> I didn't think through this hard enough to come up with a specific
> scenario that would fail, but it seems like there could be bad failure
> modes when forwarding consistency is not maintained.  That would perhaps
> suggest a "MUST" requirement to use the same rules, and perhaps even
> announcement of an identifier for what rules are in use, so that peers
> can detect an inconsistency.
#Ahmed:
As mentioned more clearly in page 9 in the latest version and in 
previous versions, the objective of the tie-breaking rules is 
determinism: I.e. if the same set of FECs are mapped to a given label 
"L1", the a router always maps the same label to the same FEC, 
irrespective of the order by which these mappings are received.

Besides the use of MUST vs SHOULD in this paragraph was extensively 
discussed over the mailing list and the consensus at that time is to 
keep it as SHOULD because SR-MPLS is already widely deployed and putting 
a MUST instead of a SHOULD will not cause anyone to modify the 
implementation simply because the source of the problem is an operator 
error in the first place

>
>     The default FEC administrative distance order starting from the
>     lowest value SHOULD be
>
> I think it would be nice if we could get this to be an "is" rather than
> a "SHOULD be", especially since at present we offer no guidance on
> actually constructing the required 8-bit numerical values.
#Ahmed: The default tie-breaking rules are SHOULD as mentioned above
>
>     The numerical sort across FECs SHOULD be performed as follows:
>
> It seems like the first two top-level bullets here are not necessarily
> part of the procedure itself, but rather some ancillary information
> about how to compute values used as part of the procedure.  I don't know
> if, editorially speaking, the presentation could be improved by
> reframing how these are discussed.
>
>         o All prefixes are represented by (128 + 8) bits.
>
>              . A prefix is encoded in the most significant bits and the
>                 remaining bits are set to zero.
>
>              . The prefix length is encoded before the prefix in a field
>                 of size 8 bits.
>
> This description seems needlessly confusing.  Couldn't we write it as
> (8+128) bits, and put the sub-bullet for the prefix length before the
> other sub-bullet, so that they appear in the order the bits are encoded?
#Ahmed: Changed to (8 + 128) as suggested
>
>     o  Encode the remaining set of FECs as follows
>
>         o Prefix, Routing Instance, Topology, Algorithm: (Prefix Length,
>           Prefix, routing_instance_id, Topology, SR Algorithm,)
>
>         o (next-hop, outgoing interface): (next-hop,
>           outgoing_interface_id)
>
>        o (number of adjacencies, list of next-hops in ascending
>           numerical order, list of outgoing interface IDs in ascending
>           numerical order). This encoding is used to encode a parallel
>           adjacency [RFC8402]
>
>         o (Endpoint, Color): (Endpoint_address, Color_id)
>
>         o (IP address): This is the encoding for a mirror SID FEC. The IP
>          address is encoded as described above in this section
>
> I think this needs to say a little bit more about what is being
> presented.  The part before the colon is what we're using to label a
> category of FECs, and the part after the colon is how it is encoded?
> There might be a more formal description language to describe the
> encoding rules used, and also the (number of adjacencies, list of
> next-hops) bullet point doesn't have a colon.
#Ahmed I added "is encoded as" to the 1st, 2nd and 4th bullet to specify 
what is being encoded
>
> We also don't specify that big-endian (network byte order) is used.
#Ahmed: Remember that the objective is to have deterministic mapping of 
a label to a certain FEC within the same router. Any CPU on which these 
rules are executed will have one endianness
>
>     o  Select the FEC with the smallest numerical value
>
> If I understand correctly, we are encoding these FECs to byte strings,
> but different types of FEC get encoded as different length byte strings.
> How do we then interpret these byte strings as numerical values?
#Ahmed: I hope Martin's response is satisfactory
>
> Section 2.6
>
>                                                      However to minimize
>     the chance of misforwarding, a FEC that loses its incoming label to
>     the tie-breaking rules specified in Section 2.5 MUST NOT be
>     installed in FIB with an outgoing segment routing label based on the
>     SID corresponding to the lost incoming label.
>
> It's not entirely clear to me how actionable this requirement is.
> That is, is the entity instaslling the FIB entry always going to know
> that the outgoing label was "based on" the incoming (non-)label?
#Ahmed: That is an implementation detail. For example, an MCC (e.g. 
ISIS, OSPF) can have "flags" or "type" attached to the incoming and/or 
outgoing label(s) downloaded to the "conflict resolution component" to 
indicate the source of  of the label(s). Again that is an implementation 
detail and I am sure any average SW engineer can implement other (and 
possibly better) ideas
>
> Section 2.7.1
>
> Setting TTL and TC improperly can have security considerations.
> This document does not discuss those, nor does RFC 8402 (the only
> reference listed from this document's security considerations).
>
> Section 4
>
> "OAM" is not listed as "well-known" at
> https://www.rfc-editor.org/materials/abbrev.expansion.txt and would
> typically qualify for expansion on first usage.
>
> Section 5
>
> [see also comment on Section 2.7.1]
>
> Should we mention that different routers can get different results from
> the tie-breaking rules in case of skew in IGP convergence?
#Ahmed: I do not think it is necessary. It is well known that if a IGP 
convergence is not consistent, then there will be a lot of inconsistent 
forwarding not only because of different tie-breaking rules
>
> Appendix A.1
>
>     The packet arrives at router R2. Because the top label 1008
>     corresponds to the IGP SID "8", which is the prefix-SID attached to
>     the prefix 192.0.2.8/32 owned by the node R8, then the instruction
>     associated with the SID is "forward the packet using all ECMP/UCMP
>     interfaces and all ECMP/UCMP next-hop(s) along the shortest/useable
>     path(s) towards R8". Because R2 is not the penultimate hop, R2
>     applies the CONTINUE operation to the packet and sends it to R3 using
>     one of the two links connected to R3 with top label 1008 as specified
>     in Section 2.10.1.
>
> "one of the two links" seems inconsistent with the claimed "using all
> ECMP/UCMP interfaces and all ECMP/UCMP next-hop(s)".
#Ahmed: not really. A given packet is always sent over only one of the 
ECMP/UCMP links/next-hops
>
>                                                              Because R3
>     is the penultimate hop, we assume that R3 performs penumtimate hop
>     popping, which corresponds to the NEXT operation, then sends the
>     packet to R8. [...]
>
> This chain of causality doesn't follow.  We assume that R3 performs PHP
> -- the fact that in this flow R3 is the penultimate hope does not factor
> into that assumption.
#Ahmed: why doesn't it factor into the assumption? We cannot assume that 
non-penultimate router performs penultimate hop popping. I.e. we can 
make the assumption of "penultimate-hop popping" ONLY if the router is 
the "penultimate-hop router". Besides it is an "assumption" because in 
general a penultimate hop router need not always perform penultimate hop 
popping.
>
> Appendix A.2.5
>
>     Since both FECs are from the same MCC, they have the same default
>     admin distance. So we compare FEC type code-point. FEC1 has FEC type
>     code-point=120, while FEC2 has FEC type code-point=130. Therefore,
>     FEC1 wins.
>
> nit: It feels a little strange to call these code-points when there's no
> registry and they're locally assigned per site policy.
>
> Appendix A.2.6
>
>     FEC1 and FEC2 both use dynamic SID assignment. Since both FECs are
>     from the same MCC, they have the same default admin distance. So we
>     compare FEC type code-point. Both FECs have FEC type code-point=120.
>     So we compare address family. Since IPv4 is preferred over IPv6, FEC1
>     wins.
>
> side note: It's a little surprising that "IPv4 is preferred over IPv6"
> did not get any objections at IETF LC.  (Example 13 has the same
> property.)
>
>

[spring] Benjamin Kaduk's Discuss on draft-ietf-s… Benjamin Kaduk via Datatracker
Re: [spring] Benjamin Kaduk's Discuss on draft-ie… Ahmed Bashandy