[mpls] Benjamin Kaduk's Discuss on draft-ietf-mpls-lsp-ping-lag-multipath-06: (with DISCUSS and COMMENT)
Benjamin Kaduk via Datatracker <noreply@ietf.org> Tue, 12 March 2019 20:30 UTC
Return-Path: <noreply@ietf.org>
X-Original-To: mpls@ietf.org
Delivered-To: mpls@ietfa.amsl.com
Received: from ietfa.amsl.com (localhost [IPv6:::1]) by ietfa.amsl.com (Postfix) with ESMTP id 02911131304; Tue, 12 Mar 2019 13:30:31 -0700 (PDT)
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit
From: Benjamin Kaduk via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-mpls-lsp-ping-lag-multipath@ietf.org, mpls-chairs@ietf.org, loa@pi.nu, mpls@ietf.org
X-Test-IDTracker: no
X-IETF-IDTracker: 6.93.0
Auto-Submitted: auto-generated
Precedence: bulk
Message-ID: <155242263093.20169.3838924661112115545.idtracker@ietfa.amsl.com>
Date: Tue, 12 Mar 2019 13:30:30 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/mpls/e_4XfWvf6zLA-iqC-R8GtblPFa4>
Subject: [mpls] Benjamin Kaduk's Discuss on draft-ietf-mpls-lsp-ping-lag-multipath-06: (with DISCUSS and COMMENT)
X-BeenThere: mpls@ietf.org
X-Mailman-Version: 2.1.29
List-Id: Multi-Protocol Label Switching WG <mpls.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/mpls>, <mailto:mpls-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/mpls/>
List-Post: <mailto:mpls@ietf.org>
List-Help: <mailto:mpls-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/mpls>, <mailto:mpls-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 12 Mar 2019 20:30:36 -0000
Benjamin Kaduk has entered the following ballot position for draft-ietf-mpls-lsp-ping-lag-multipath-06: Discuss When responding, please keep the subject line intact and reply to all email addresses included in the To and CC lines. (Feel free to cut this introductory paragraph, however.) Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html for more information about IESG DISCUSS and COMMENT positions. The document, along with other ballot positions, can be found here: https://datatracker.ietf.org/doc/draft-ietf-mpls-lsp-ping-lag-multipath/ ---------------------------------------------------------------------- DISCUSS: ---------------------------------------------------------------------- Thanks for this document; it's clearly going to provide value, and it gives a pretty well-readable description of how things are expected to work. That said, there's a number of details that need to be worked out before publication... This document has six listed authors; per RFC 7322, documents listing more than five authors are unusual, and six is greater than five. Let's talk about the author count. "optional TLV" seems to be used in MPLS contexts as a technical term for "comprehension-optional", not in the "optional to send" sense; it's the counterpart of "mandatory TLV", and this terminology is even documented in the MPLS TLVs registry. It's my understanding that the LSP Capabilities TLV and the Detailed Interface and Label Stack TLV are intended to be comprehension-required (i.e., "mandatory TLVs"), and thus that we must not use the phrase "optional TLV" in their description. This would be made clear if we listed which range of values we intended to allocate a TLV type from, in the IANA considerations, but that remains unspecified at this time. In a similar vein (the "comprehension-required" behavior), there are a few places (Section 3 (twice), and Section 6; also Section 3 for the LAG Description Indicator flag; see COMMENT) where we state new normative language ("MUST") that is unenforceable, since it would need to apply to MPLS implementations that do not implement this specification. Fortunately, the comprehension-required TLV ranges provide this functionality for us without the need to use normative language. Section 4.2 has some conflicting "MUST"s about the ordering/presence of sub-TLVs in the DDMAP TLV (see COMMENT, and also the "MANDATORY" langauge in Figure 2). If I'm reading Section 5.1.2 correctly, the described procedure is only intended to apply when the "G" flag is present (as well as the "I" flag, the requirement for which is explicitly stated), but this is not explicitly stated. In particular, the text as written says it applies to all responders that "understand the LAG Description Indicator flag" with no mention of runtime check for the presence of that flag. I also worry that Sections 8, 9, and 10 are insufficiently clear about the encoding of the interface index value -- is it an integer in NBO, an opaque bitstring, or something else? ---------------------------------------------------------------------- COMMENT: ---------------------------------------------------------------------- Am I reading this correctly that the "LSR Capability TLV" created here is basically only applicable to properties of MPLS echo request/reply exchanges? If so, then perhaps the moniker "LSR Capability" is overly generic. Section 1.2 o Label switching over some member links of the LAG is successful, but will be failed over other member links of the LAG. nit: there's some verb tense inconsistency here; maybe "but fails over other member links" would help. Section 2 This document defines an optional TLV to discover the capabilities of a responder LSR and extensions for use with the MPLS LSP Ping and nit: is it only the *responder* LSR we care about, or are there potentially intermediaries in play? The solution consists of the MPLS echo request containing a DDMAP TLV and the optional LSR capability TLV to indicate that separate load nit: Is this "optional capability TLV" the same "optional TLV" that we are defining in this document? If so, calling it "the new optional LSR capability" would aid clarity. (Also, we seem to use both the "capability" and "Capability" capitzliations for describing this TLV.) Section 3 Capability TLV in the MPLS echo request message. When the responder LSR receives an MPLS echo request message with the LSR Capability TLV included, if it knows the LSR Capability TLV, then it MUST include the LSR Capability TLV in the MPLS echo reply message with the LSR Capability TLV describing features and extensions supported by the local LSR. Otherwise, an MPLS echo reply MUST be sent back to the initiator LSR with the return code set to "One or more of the TLVs was not understood". [...] This last MUST seems like something that this document cannot mandate (as it attempts to apply to non-implementations of this document); it could only work if it was an existing requirement from the core MPLS spec, in which case lower-case "must" is appropriate (possibly with section reference). o If the responder LSR does not understand the "LAG Description Indicator flag": * Clear both the "Downstream LAG Info Accommodation flag" and the "Upstream LAG Info Accommodation flag". Similarly, if an LSR does not understand a flag, it cannot give special treatment to packets containing that flag. (Why an LSR would not understand a flag defined in the same document that defines the Capabilities TLV is beyond me, but you seem to want to allow for that case.) If the responder does not know the LSR Capability TLV, an MPLS echo reply with the return code set to "One or more of the TLVs was not understood" MUST be sent back to the initiator LSR. This duplicates the content I quoted at the top of this section's comments. Section 4.1 Once the initiator LSR knows the capabilities that a responder supports, then it sends an MPLS echo request carrying a DDMAP with the "LAG Description Indicator flag" (G) set to the responder LSR. nit: I think the key point is not that the initiator knows the capabilities of the peer, but rather that the initiator knows that the peer supports this specific mechanism. Also, is this "a DDMAP TLV"? Section 4.2 Reading this makes it sound like the "LAG Description Indicator flag" is completely orthogonal to regular DDMAP functionality, so that if I set the LAG description indicator flag but do not otherwise request detailed descriptions, only the LAG interfaces are returned. However, the flag in question has to be inside a DDMAP TLV (and we should really say so, perhaps as "[flag] set *in the DDMAP TLV*"!), so it seems like in practice the LAG information can only be obtained in conjunction with full detailed description information. (The specific suggestion here is to note clearly that the flag is se in the DDMAP TLV.) * The responder LSR MUST include a DDMAP TLV when sending the MPLS echo reply. I got confused on my first pass through this section; adding "There is a single DDMAP TLV for the LAG interface, with member links described using sub-TLVs" here would have kept me from veering onto the wrong path. (I thought there was going to be one DDMAP TLV per member link, plus one for the LAG aggregate, with lots of duplication.) Based on the procedures described above, every LAG member link will have a Local Interface Index Sub-TLV and a Multipath Data Sub-TLV entries in the DDMAP TLV. The order of the Sub-TLVs in the DDMAP TLV for a LAG member link MUST be Local Interface Index Sub-TLV immediately followed by Multipath Data Sub-TLV. A LAG member link MAY also have a corresponding Remote Interface Index Sub-TLV. When a Local Interface Index Sub-TLV, a Remote Interface Index-Sub-TLV and a Multipath Data Sub-TLV are placed in the DDMAP TLV to describe a LAG member link, they MUST be placed in the order of Local Interface Index Sub-TLV, Remote Interface Index-Sub-TLV and Multipath Data Sub- TLV. This last MUST contradicts the previous MUST. I suggest some more text to clarify under which conditions each requirement applies. I'd also suggest not using the figure for conveying a (apparent?) mandatory requirement, and instead adding some text at the end: "The block of local interface index, (optional remote interface index) and multipath data sub-TLVs for each member link MUST appear adjacent to each other in order of increasing local interface index." It's confusing to label the multiplath data sub-TLV as "MANDATORY" when the body text says "MUST add an [sic] Multipath data Sub-TLV [...] if the received DDMAP TLV requested multipath information", i.e., it is not always present, based on the contents of the echo request. When none of the received multipath information maps to a particular LAG member link, then the responder LSR MUST still place the Local Interface Index Sub-TLV and the Multipath Data Sub-TLV for that LAG member link in the DDMAP TLV, the value of Multipath Length field of the Multipath Data Sub-TLV is set to zero. nit: the last comma is a comma splice. Section 5.1.2 It seems like this places a slightly stronger burden on the responder than stock 8029 does, in that there is a MUST-level requirement to act on the 'I' flag in combination with the 'G' flag, whereas for the 'I' flag alone 8029 only has a SHOULD-level requirement to respond accordingly. We may want to call this out as a change in behavior. Section 5.1.3 Expectation is that there's a relationship between the interface index of the outgoing LAG member link at TTL=n and the interface index of the incoming LAG member link at TTL=n+1 for all discovered entropies. In other words, set of entropies that load balances to nit: "The expectation" nit: "discovered entropies" is a strange wording given how the entropy label works; is this more like "all entropies examined"? The initiator LSR sends two MPLS echo request messages to traverse the two LAG member links at TTL=n+1: How does the initiator know (which entropy label values?) to get the echo requests to traverse different member links? o Error case: * One MPLS echo request message reaches TTL=n+1 on an LAG member link 1. * The other MPLS echo request message also reaches TTL=n+1 on an LAG member link 1. One or two MPLS echo request messages sent by the initiator LSR cannot reach the immediate downstream LSR, or the two MPLS echo request messages reach at the immediate downstream LSR from the same LAG member link. The description paragraph doesn't seem to match up the scenario described in the sub-bullets, which leaves me confused as to what scenario(s) are intended to be considered as the "error case". Section 6 This document defines a new optional TLV which is referred to as the "LSR Capability TLV. [...] Is this optional to use or comprehension-optional? (We elsewhere rely on comprehension-mandatory behavior, so in the RFC 8029 terminology this seems to be a "mandatory TLV" even if it is not mandatory to use.) included in the MPLS echo reply message. Otherwise, if the responder does not know the LSR Capability TLV, an MPLS echo reply with the return code set to "One or more of the TLVs was not understood" MUST be sent back to the initiator LSR. This duplicates a requirement stated elsewhere (that really ought to just be using existing required behavior from 8029 anyway). LSR Capability TLV Type is TBD1. Length is 4. The value field of In a hypothetical future where we need to allocate a 33rd capability flag, would we rather create a new "extended capabilities" TLV (burning another 32 bits for type+length) or leave flexibility now for 'length' to increase in multiples of 4 with receivers ignoring bits past what they know about? The "LSR Capability Flags" field is 4 octets in length, this document defines the following flags: 0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Must Be Zero (Reserved) |U|D| +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ I'm wary of describing these as "Must Be Zero", since they're just unallocated but not required to be zero other than by the current state of bit allocations. (I guess this is conventional for MPLS, though? So maybe it's not a problem.) Just "Reserved" would be fine. This document defines two flags. The remaining flags MUST be set to zero when sending and ignored on receipt. Both the U and the D flag MUST be cleared in the MPLS echo request message when sending, and ignored on receipt. Neither, either or both the U and the D flag MAY be set in the MPLS echo reply message. Similarly, I think we want to say "unallocated flags" rather than "the remaining flags". It also seems like the semantics being described here are "the TLV value is ignored in the echo request and only has meaning in the response" (i.e., this is a "tell me your capabilities" exchange, not a "here are mine; what are yours?" exchange). As written, that applies only to the U and D bits, and is not a general property. That may well be a fine choice if we think we might want the two-way exchange for some future allocated bit, but if we do think it's a general property of the mechanism, I'd suggest rewording the text. Section 7 (Per IANA, bits 4 and 5 are already assigned but are not reflected in the diagram.) Section 10 The Detailed Interface and Label Stack TLV format is derived from the Interface and Label Stack TLV format (from [RFC8029]). Two changes are introduced. The first is that the label stack is converted into a sub-TLV. The second is that a new sub-TLV is added to describe an interface index. These fields of Detailed Interface and Label Stack TLV have the same use and meaning as in [RFC8029]. A summary of these fields is as below: nit: Are "These fields" just "The other fields"? The Address Type indicates if the interface is numbered or unnumbered. It also determines the length of the IP Address and Interface fields. The resulting total length of the initial part of the TLV is listed as "K Octets". The Address Type is set to one of the following values: nit: is it better to list the currently allocated values or just refer to the IANA registry? I seee that this "index assigned to the interface" language for unnumbered address types originates from RFC 8029, but both this document and 8029 leave me confused as to the encoding used for this index (for either/both IPv4 or IPv6 address types). [My comment about "Must Be Zero" and allocatable bits also applies here] Section 12 This document also adds a capability "negotiation" mechanism for MPLS echo request/reply exchanges. As MPLS does not offer integrity protection for its messages, this negotiation is only as trustworthy as the network from which messages are accepted as valid, and initiators have no recourse but to accept faithfully the echo replies received. To prevent leakage of vital information to untrusted users, a responder LSR MUST only accept MPLS echo request messages from designated trusted sources via filtering source IP address field of received MPLS echo request messages. [...] I'd prefer to see a note added here about how "source IP address filtering provides only a weak form of access control and is not, in general, a reliable security mechanism. Nonetheless, it is required here in the absence of any more robust mechanism that might be used." Section 13 Should we ask IANA to update the "Interface and Label Stack Address Types" registry to also refer to this document, since we are sharing the registry between the Interface and Label Stack TLV and the Detailed version? Section 13.2 The range 4-31743 spans two different allocation procedures (Standards Action and Specification Required - Experimental RFC needed); I assume that we want the 4-16383 range for Standards Action - mandatory TLVs. Section 13.4 We don't say what range we want this to be allocated from -- I assume 4-16383 since this is a negotiated feature and sending it outside the negotiated scenario should be an error. Section 13.4.1 (I note that some existing sub-TLV registries have "Experimental RFC required" for the experimental ranges, but that is arguably more stringent than necessary; Specification Required seems more appropriate for this case.) Appendix A I was kind of hoping for some more guidance on what an initiator could do in these scenarios to try to get better visibility into the switch behavior (and, hopefully, a workaround to get reliable echo responses that cover all LAG member links). AFAICT there's not a better option than "try a bunch of entropy labels and see what responses you can get back" and that's the same remedy in all the described topologies, but it would be nice to have this stated explicitly for the reader. Section A.1 In the worst case, MPLS echo request messages with specific entropies to exercise every LAG members from R1 to S1 can all reach R2 via same LAG member. [...] I'm confused by the phrase "specific entropies to exercise every LAG members [sic] from R1 to S1" -- is there some deterministic algorithm by which a specific entropy label value will force a specific LAG member link? My understanding was that the entropy was used as input to an opaque hash function and that trial-and-error was the only way to explore the mapping from entropy value to LAG member link taken.
- [mpls] Benjamin Kaduk's Discuss on draft-ietf-mpl… Benjamin Kaduk via Datatracker
- Re: [mpls] Benjamin Kaduk's Discuss on draft-ietf… Loa Andersson
- Re: [mpls] Benjamin Kaduk's Discuss on draft-ietf… Benjamin Kaduk
- Re: [mpls] Benjamin Kaduk's Discuss on draft-ietf… Mach Chen
- Re: [mpls] Benjamin Kaduk's Discuss on draft-ietf… Mach Chen
- Re: [mpls] Benjamin Kaduk's Discuss on draft-ietf… Benjamin Kaduk
- Re: [mpls] Benjamin Kaduk's Discuss on draft-ietf… Mach Chen