Re: [mpls] Benjamin Kaduk's Discuss on draft-ietf-mpls-lsp-ping-lag-multipath-06: (with DISCUSS and COMMENT)

Loa Andersson <> Wed, 13 March 2019 01:59 UTC

Return-Path: <>
Received: from localhost (localhost []) by (Postfix) with ESMTP id DE9E61311E4; Tue, 12 Mar 2019 18:59:45 -0700 (PDT)
X-Virus-Scanned: amavisd-new at
X-Spam-Flag: NO
X-Spam-Score: -1.899
X-Spam-Status: No, score=-1.899 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Received: from ([]) by localhost ( []) (amavisd-new, port 10024) with ESMTP id hM0WQ5SnWL6T; Tue, 12 Mar 2019 18:59:39 -0700 (PDT)
Received: from ( []) (using TLSv1.1 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by (Postfix) with ESMTPS id 5F4131311D1; Tue, 12 Mar 2019 18:59:36 -0700 (PDT)
Received: from [] (unknown []) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) (Authenticated sender: by (Postfix) with ESMTPSA id 071D9180157E; Wed, 13 Mar 2019 02:59:32 +0100 (CET)
To: Benjamin Kaduk via Datatracker <>, The IESG <>
References: <>
From: Loa Andersson <>
Message-ID: <>
Date: Wed, 13 Mar 2019 09:59:29 +0800
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:60.0) Gecko/20100101 Thunderbird/60.5.3
MIME-Version: 1.0
In-Reply-To: <>
Content-Type: text/plain; charset="utf-8"; format="flowed"
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Archived-At: <>
Subject: Re: [mpls] Benjamin Kaduk's Discuss on draft-ietf-mpls-lsp-ping-lag-multipath-06: (with DISCUSS and COMMENT)
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Multi-Protocol Label Switching WG <>
List-Unsubscribe: <>, <>
List-Archive: <>
List-Post: <>
List-Help: <>
List-Subscribe: <>, <>
X-List-Received-Date: Wed, 13 Mar 2019 01:59:52 -0000


The number of authors were discussed and explained in the Shepherds
Write-up. All the 6 authors listed on the front-page has made
text contributions to the document.


On 2019-03-13 04:30, Benjamin Kaduk via Datatracker wrote:
> Benjamin Kaduk has entered the following ballot position for
> draft-ietf-mpls-lsp-ping-lag-multipath-06: Discuss
> When responding, please keep the subject line intact and reply to all
> email addresses included in the To and CC lines. (Feel free to cut this
> introductory paragraph, however.)
> Please refer to
> for more information about IESG DISCUSS and COMMENT positions.
> The document, along with other ballot positions, can be found here:
> ----------------------------------------------------------------------
> ----------------------------------------------------------------------
> Thanks for this document; it's clearly going to provide value, and it
> gives a pretty well-readable description of how things are expected to
> work.  That said, there's a number of details that need to be worked out
> before publication...
> This document has six listed authors; per RFC 7322, documents listing
> more than five authors are unusual, and six is greater than five.  Let's
> talk about the author count.

> ----------------------------------------------------------------------
> ----------------------------------------------------------------------
> Am I reading this correctly that the "LSR Capability TLV" created here
> is basically only applicable to properties of MPLS echo request/reply
> exchanges?  If so, then perhaps the moniker "LSR Capability" is overly
> generic.
> Section 1.2
>     o  Label switching over some member links of the LAG is successful,
>        but will be failed over other member links of the LAG.
> nit: there's some verb tense inconsistency here; maybe "but fails over
> other member links" would help.
> Section 2
>     This document defines an optional TLV to discover the capabilities of
>     a responder LSR and extensions for use with the MPLS LSP Ping and
> nit: is it only the *responder* LSR we care about, or are there
> potentially intermediaries in play?
>     The solution consists of the MPLS echo request containing a DDMAP TLV
>     and the optional LSR capability TLV to indicate that separate load
> nit: Is this "optional capability TLV" the same "optional TLV" that we are
> defining in this document?  If so, calling it "the new optional LSR
> capability" would aid clarity.  (Also, we seem to use both the
> "capability" and "Capability" capitzliations for describing this TLV.)
> Section 3
>     Capability TLV in the MPLS echo request message.  When the responder
>     LSR receives an MPLS echo request message with the LSR Capability TLV
>     included, if it knows the LSR Capability TLV, then it MUST include
>     the LSR Capability TLV in the MPLS echo reply message with the LSR
>     Capability TLV describing features and extensions supported by the
>     local LSR.  Otherwise, an MPLS echo reply MUST be sent back to the
>     initiator LSR with the return code set to "One or more of the TLVs
>     was not understood".  [...]
> This last MUST seems like something that this document cannot mandate
> (as it attempts to apply to non-implementations of this document); it
> could only work if it was an existing requirement from the core MPLS
> spec, in which case lower-case "must" is appropriate (possibly with
> section reference).
>     o  If the responder LSR does not understand the "LAG Description
>        Indicator flag":
>        *  Clear both the "Downstream LAG Info Accommodation flag" and the
>           "Upstream LAG Info Accommodation flag".
> Similarly, if an LSR does not understand a flag, it cannot give special
> treatment to packets containing that flag.  (Why an LSR would not
> understand a flag defined in the same document that defines the
> Capabilities TLV is beyond me, but you seem to want to allow for that
> case.)
>     If the responder does not know the LSR Capability TLV, an MPLS echo
>     reply with the return code set to "One or more of the TLVs was not
>     understood" MUST be sent back to the initiator LSR.
> This duplicates the content I quoted at the top of this section's
> comments.
> Section 4.1
>     Once the initiator LSR knows the capabilities that a responder
>     supports, then it sends an MPLS echo request carrying a DDMAP with
>     the "LAG Description Indicator flag" (G) set to the responder LSR.
> nit: I think the key point is not that the initiator knows the
> capabilities of the peer, but rather that the initiator knows that the
> peer supports this specific mechanism.
> Also, is this "a DDMAP TLV"?
> Section 4.2
> Reading this makes it sound like the "LAG Description Indicator flag" is
> completely orthogonal to regular DDMAP functionality, so that if I set
> the LAG description indicator flag but do not otherwise request detailed
> descriptions, only the LAG interfaces are returned.  However, the flag
> in question has to be inside a DDMAP TLV (and we should really say so,
> perhaps as "[flag] set *in the DDMAP TLV*"!), so it seems like in practice
> the LAG information can only be obtained in conjunction with full
> detailed description information.  (The specific suggestion here is to
> note clearly that the flag is se in the DDMAP TLV.)
>        *  The responder LSR MUST include a DDMAP TLV when sending the
>           MPLS echo reply.
> I got confused on my first pass through this section; adding "There is a
> single DDMAP TLV for the LAG interface, with member links described
> using sub-TLVs" here would have kept me from veering onto the wrong path.
> (I thought there was going to be one DDMAP TLV per member link, plus one
> for the LAG aggregate, with lots of duplication.)
>     Based on the procedures described above, every LAG member link will
>     have a Local Interface Index Sub-TLV and a Multipath Data Sub-TLV
>     entries in the DDMAP TLV.  The order of the Sub-TLVs in the DDMAP TLV
>     for a LAG member link MUST be Local Interface Index Sub-TLV
>     immediately followed by Multipath Data Sub-TLV.  A LAG member link
>     MAY also have a corresponding Remote Interface Index Sub-TLV.  When a
>     Local Interface Index Sub-TLV, a Remote Interface Index-Sub-TLV and a
>     Multipath Data Sub-TLV are placed in the DDMAP TLV to describe a LAG
>     member link, they MUST be placed in the order of Local Interface
>     Index Sub-TLV, Remote Interface Index-Sub-TLV and Multipath Data Sub-
>     TLV.
> This last MUST contradicts the previous MUST.  I suggest some more text
> to clarify under which conditions each requirement applies.
> I'd also suggest not using the figure for conveying a (apparent?) mandatory
> requirement, and instead adding some text at the end: "The block of local
> interface index, (optional remote interface index) and multipath data
> sub-TLVs for each member link MUST appear adjacent to each other in
> order of increasing local interface index."
> It's confusing to label the multiplath data sub-TLV as "MANDATORY" when
> the body text says "MUST add an [sic] Multipath data Sub-TLV [...] if
> the received DDMAP TLV requested multipath information", i.e., it is not
> always present, based on the contents of the echo request.
>     When none of the received multipath information maps to a particular
>     LAG member link, then the responder LSR MUST still place the Local
>     Interface Index Sub-TLV and the Multipath Data Sub-TLV for that LAG
>     member link in the DDMAP TLV, the value of Multipath Length field of
>     the Multipath Data Sub-TLV is set to zero.
> nit: the last comma is a comma splice.
> Section 5.1.2
> It seems like this places a slightly stronger burden on the responder
> than stock 8029 does, in that there is a MUST-level requirement to act
> on the 'I' flag in combination with the 'G' flag, whereas for the 'I'
> flag alone 8029 only has a SHOULD-level requirement to respond
> accordingly.  We may want to call this out as a change in behavior.
> Section 5.1.3
>     Expectation is that there's a relationship between the interface
>     index of the outgoing LAG member link at TTL=n and the interface
>     index of the incoming LAG member link at TTL=n+1 for all discovered
>     entropies.  In other words, set of entropies that load balances to
> nit: "The expectation"
> nit: "discovered entropies" is a strange wording given how the entropy
> label works; is this more like "all entropies examined"?
>     The initiator LSR sends two MPLS echo request messages to traverse
>     the two LAG member links at TTL=n+1:
> How does the initiator know (which entropy label values?) to get the echo
> requests to traverse different member links?
>     o  Error case:
>        *  One MPLS echo request message reaches TTL=n+1 on an LAG member
>           link 1.
>        *  The other MPLS echo request message also reaches TTL=n+1 on an
>           LAG member link 1.
>        One or two MPLS echo request messages sent by the initiator LSR
>        cannot reach the immediate downstream LSR, or the two MPLS echo
>        request messages reach at the immediate downstream LSR from the
>        same LAG member link.
> The description paragraph doesn't seem to match up the scenario
> described in the sub-bullets, which leaves me confused as to what
> scenario(s) are intended to be considered as the "error case".
> Section 6
>     This document defines a new optional TLV which is referred to as the
>     "LSR Capability TLV.  [...]
> Is this optional to use or comprehension-optional?  (We elsewhere rely
> on comprehension-mandatory behavior, so in the RFC 8029 terminology this
> seems to be a "mandatory TLV" even if it is not mandatory to use.)
>     included in the MPLS echo reply message.  Otherwise, if the responder
>     does not know the LSR Capability TLV, an MPLS echo reply with the
>     return code set to "One or more of the TLVs was not understood" MUST
>     be sent back to the initiator LSR.
> This duplicates a requirement stated elsewhere (that really ought to
> just be using existing required behavior from 8029 anyway).
>     LSR Capability TLV Type is TBD1.  Length is 4.  The value field of
> In a hypothetical future where we need to allocate a 33rd capability
> flag, would we rather create a new "extended capabilities" TLV (burning
> another 32 bits for type+length) or leave flexibility now for 'length'
> to increase in multiples of 4 with receivers ignoring bits past what
> they know about?
>        The "LSR Capability Flags" field is 4 octets in length, this
>        document defines the following flags:
>        0                   1                   2                   3
>        0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
>       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>       |                  Must Be Zero (Reserved)                  |U|D|
>       +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> I'm wary of describing these as "Must Be Zero", since they're just
> unallocated but not required to be zero other than by the current state
> of bit allocations.  (I guess this is conventional for MPLS, though?  So
> maybe it's not a problem.)  Just "Reserved" would be fine.
>        This document defines two flags.  The remaining flags MUST be set
>        to zero when sending and ignored on receipt.  Both the U and the D
>        flag MUST be cleared in the MPLS echo request message when
>        sending, and ignored on receipt.  Neither, either or both the U
>        and the D flag MAY be set in the MPLS echo reply message.
> Similarly, I think we want to say "unallocated flags" rather than "the
> remaining flags".
> It also seems like the semantics being described here are "the TLV value
> is ignored in the echo request and only has meaning in the response"
> (i.e., this is a "tell me your capabilities" exchange, not a "here are
> mine; what are yours?" exchange).  As written, that applies only to the
> U and D bits, and is not a general property.  That may well be a fine
> choice if we think we might want the two-way exchange for some future
> allocated bit, but if we do think it's a general property of the
> mechanism, I'd suggest rewording the text.
> Section 7
> (Per IANA, bits 4 and 5 are already assigned but are not reflected in the
> diagram.)
> Section 10
>     The Detailed Interface and Label Stack TLV format is derived from the
>     Interface and Label Stack TLV format (from [RFC8029]).  Two changes
>     are introduced.  The first is that the label stack is converted into
>     a sub-TLV.  The second is that a new sub-TLV is added to describe an
>     interface index.  These fields of Detailed Interface and Label Stack
>     TLV have the same use and meaning as in [RFC8029].  A summary of
>     these fields is as below:
> nit: Are "These fields" just "The other fields"?
>           The Address Type indicates if the interface is numbered or
>           unnumbered.  It also determines the length of the IP Address
>           and Interface fields.  The resulting total length of the
>           initial part of the TLV is listed as "K Octets".  The Address
>           Type is set to one of the following values:
> nit: is it better to list the currently allocated values or just refer
> to the IANA registry?
> I seee that this "index assigned to the interface" language for
> unnumbered address types originates from RFC 8029, but both this
> document and 8029 leave me confused as to the encoding used for this
> index (for either/both IPv4 or IPv6 address types).
> [My comment about "Must Be Zero" and allocatable bits also applies here]
> Section 12
> This document also adds a capability "negotiation" mechanism for MPLS
> echo request/reply exchanges.  As MPLS does not offer integrity
> protection for its messages, this negotiation is only as trustworthy as
> the network from which messages are accepted as valid, and initiators
> have no recourse but to accept faithfully the echo replies received.
>     To prevent leakage of vital information to untrusted users, a
>     responder LSR MUST only accept MPLS echo request messages from
>     designated trusted sources via filtering source IP address field of
>     received MPLS echo request messages.  [...]
> I'd prefer to see a note added here about how "source IP address
> filtering provides only a weak form of access control and is not, in
> general, a reliable security mechanism.  Nonetheless, it is required
> here in the absence of any more robust mechanism that might be used."
> Section 13
> Should we ask IANA to update the "Interface and Label Stack Address
> Types" registry to also refer to this document, since we are sharing the
> registry between the Interface and Label Stack TLV and the Detailed
> version?
> Section 13.2
> The range 4-31743 spans two different allocation procedures (Standards
> Action and Specification Required - Experimental RFC needed); I assume
> that we want the 4-16383 range for Standards Action - mandatory TLVs.
> Section 13.4
> We don't say what range we want this to be allocated from -- I assume
> 4-16383 since this is a negotiated feature and sending it outside the
> negotiated scenario should be an error.
> Section 13.4.1
> (I note that some existing sub-TLV registries have "Experimental RFC
> required" for the experimental ranges, but that is arguably more
> stringent than necessary; Specification Required seems more appropriate
> for this case.)
> Appendix A
> I was kind of hoping for some more guidance on what an initiator could
> do in these scenarios to try to get better visibility into the switch
> behavior (and, hopefully, a workaround to get reliable echo responses
> that cover all LAG member links).  AFAICT there's not a better option
> than "try a bunch of entropy labels and see what responses you can get
> back" and that's the same remedy in all the described topologies, but it
> would be nice to have this stated explicitly for the reader.
> Section A.1
>                      In the worst case, MPLS echo request messages with
>     specific entropies to exercise every LAG members from R1 to S1 can
>     all reach R2 via same LAG member.  [...]
> I'm confused by the phrase "specific entropies to exercise every LAG
> members [sic] from R1 to S1" -- is there some deterministic algorithm by
> which a specific entropy label value will force a specific LAG member
> link?  My understanding was that the entropy was used as input to an
> opaque hash function and that trial-and-error was the only way to
> explore the mapping from entropy value to LAG member link taken.


Loa Andersson                        email:
Senior MPLS Expert
Bronze Dragon Consulting             phone: +46 739 81 21 64