Re: [mpls] Benjamin Kaduk's Discuss on draft-ietf-mpls-lsp-ping-lag-multipath-07: (with DISCUSS and COMMENT)

Mach Chen <mach.chen@huawei.com> Thu, 04 April 2019 11:24 UTC

Return-Path: <mach.chen@huawei.com>
X-Original-To: mpls@ietfa.amsl.com
Delivered-To: mpls@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id DC9BA120495; Thu, 4 Apr 2019 04:24:30 -0700 (PDT)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -4.2
X-Spam-Level:
X-Spam-Status: No, score=-4.2 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, RCVD_IN_DNSWL_MED=-2.3, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id ItV5hA4al-2R; Thu, 4 Apr 2019 04:24:27 -0700 (PDT)
Received: from huawei.com (lhrrgout.huawei.com [185.176.76.210]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 3F0B11206A9; Thu, 4 Apr 2019 04:24:27 -0700 (PDT)
Received: from LHREML714-CAH.china.huawei.com (unknown [172.18.7.107]) by Forcepoint Email with ESMTP id 2A6A6955F03197551250; Thu, 4 Apr 2019 12:24:25 +0100 (IST)
Received: from DGGEML422-HUB.china.huawei.com (10.1.199.39) by LHREML714-CAH.china.huawei.com (10.201.108.37) with Microsoft SMTP Server (TLS) id 14.3.408.0; Thu, 4 Apr 2019 12:24:24 +0100
Received: from DGGEML510-MBX.china.huawei.com ([169.254.2.113]) by dggeml422-hub.china.huawei.com ([10.1.199.39]) with mapi id 14.03.0415.000; Thu, 4 Apr 2019 19:24:22 +0800
From: Mach Chen <mach.chen@huawei.com>
To: Benjamin Kaduk <kaduk@mit.edu>, The IESG <iesg@ietf.org>
CC: "draft-ietf-mpls-lsp-ping-lag-multipath@ietf.org" <draft-ietf-mpls-lsp-ping-lag-multipath@ietf.org>, "mpls-chairs@ietf.org" <mpls-chairs@ietf.org>, "loa@pi.nu" <loa@pi.nu>, "mpls@ietf.org" <mpls@ietf.org>
Thread-Topic: Benjamin Kaduk's Discuss on draft-ietf-mpls-lsp-ping-lag-multipath-07: (with DISCUSS and COMMENT)
Thread-Index: AQHU6jioOlipL0w4g0qDJ1tOkPDTlKYrPATg
Date: Thu, 04 Apr 2019 11:24:22 +0000
Message-ID: <F73A3CB31E8BE34FA1BBE3C8F0CB2AE292995741@dggeml510-mbx.china.huawei.com>
References: <155430820919.22676.3438359950422110370.idtracker@ietfa.amsl.com>
In-Reply-To: <155430820919.22676.3438359950422110370.idtracker@ietfa.amsl.com>
Accept-Language: en-US, zh-CN
Content-Language: zh-CN
X-MS-Has-Attach:
X-MS-TNEF-Correlator:
x-originating-ip: [10.111.194.201]
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: base64
MIME-Version: 1.0
X-CFilter-Loop: Reflected
Archived-At: <https://mailarchive.ietf.org/arch/msg/mpls/30BWJGrngEIrFIf1v--zEJpv29A>
Subject: Re: [mpls] Benjamin Kaduk's Discuss on draft-ietf-mpls-lsp-ping-lag-multipath-07: (with DISCUSS and COMMENT)
X-BeenThere: mpls@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Multi-Protocol Label Switching WG <mpls.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/mpls>, <mailto:mpls-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/mpls/>
List-Post: <mailto:mpls@ietf.org>
List-Help: <mailto:mpls-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/mpls>, <mailto:mpls-request@ietf.org?subject=subscribe>
X-List-Received-Date: Thu, 04 Apr 2019 11:24:37 -0000

Hi Benjamin,

Thanks for your prompt response!

Please see my replies inline...

> -----Original Message-----
> From: Benjamin Kaduk via Datatracker [mailto:noreply@ietf.org]
> Sent: Thursday, April 04, 2019 12:17 AM
> To: The IESG <iesg@ietf.org>
> Cc: draft-ietf-mpls-lsp-ping-lag-multipath@ietf.org; mpls-chairs@ietf.org;
> loa@pi.nu; mpls@ietf.org
> Subject: Benjamin Kaduk's Discuss on draft-ietf-mpls-lsp-ping-lag-multipath-
> 07: (with DISCUSS and COMMENT)
> 
> Benjamin Kaduk has entered the following ballot position for
> draft-ietf-mpls-lsp-ping-lag-multipath-07: Discuss
> 
> When responding, please keep the subject line intact and reply to all email
> addresses included in the To and CC lines. (Feel free to cut this introductory
> paragraph, however.)
> 
> 
> Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
> for more information about IESG DISCUSS and COMMENT positions.
> 
> 
> The document, along with other ballot positions, can be found here:
> https://datatracker.ietf.org/doc/draft-ietf-mpls-lsp-ping-lag-multipath/
> 
> 
> 
> ----------------------------------------------------------------------
> DISCUSS:
> ----------------------------------------------------------------------
> 
> Thanks for the updates in the -07; we seem to be in agreement on the main
> path forward.  That said, in Section 4.2 we still see that:
> 
>    Based on the procedures described above, every LAG member link will
>    have a Local Interface Index Sub-TLV and a Multipath Data Sub-TLV
>    entries in the DDMAP TLV.  The order of the Sub-TLVs in the DDMAP TLV
>    for a LAG member link MUST be Local Interface Index Sub-TLV
>    immediately followed by Multipath Data Sub-TLV.  A LAG member link
>    MAY also have a corresponding Remote Interface Index Sub-TLV.  When a
>    [...]
> 
> I think we need "except as follows" or similar at the end of the second
> sentence, since otherwise we go on to have a MUST that contradicts the
> "MUST be Local Interface Index [...] immediately followed by Multipath
> Data".

I am fine to add "except as follows".

> 
> (Also, we missed one instance of "optional" in Section 8: "Local Interface
> Index Sub-TLV is an optional TLV")
> 
> 

Regarding the following comments, most of them addressed in version 07, a few fixed in version 08.


> ----------------------------------------------------------------------
> COMMENT:
> ----------------------------------------------------------------------
> 
> A couple nits in the new text in the -07: "an new" (should be "a new") and
> "mechanims" (should be "mechanism").

Fixed in version 08.

> 
> [the following COMMENT portion has not been revised since the ballot on
> the -06 and may contain stale information]
> 
> Am I reading this correctly that the "LSR Capability TLV" created here is
> basically only applicable to properties of MPLS echo request/reply exchanges?
> If so, then perhaps the moniker "LSR Capability" is overly generic.

Although the "LSR Capability TLV" is currently used for LSP Ping, it actually can be applicable for more generic purpose. I incline to keep it as-is. 

> 
> Section 1.2
> 
>    o  Label switching over some member links of the LAG is successful,
>       but will be failed over other member links of the LAG.
> 
> nit: there's some verb tense inconsistency here; maybe "but fails over other
> member links" would help.

Seems it has been addressed in version 07.

> 
> Section 2
> 
>    This document defines an optional TLV to discover the capabilities of
>    a responder LSR and extensions for use with the MPLS LSP Ping and
> 
> nit: is it only the *responder* LSR we care about, or are there potentially
> intermediaries in play?

Only the responder LSR we care about.

> 
>    The solution consists of the MPLS echo request containing a DDMAP TLV
>    and the optional LSR capability TLV to indicate that separate load
> 
> nit: Is this "optional capability TLV" the same "optional TLV" that we are
> defining in this document?  If so, calling it "the new optional LSR capability"
> would aid clarity.  

(Also, we seem to use both the "capability" and
> "Capability" capitzliations for describing this TLV.)

Only one left, fixed in version 08.

> 
> Section 3
> 
>    Capability TLV in the MPLS echo request message.  When the responder
>    LSR receives an MPLS echo request message with the LSR Capability TLV
>    included, if it knows the LSR Capability TLV, then it MUST include
>    the LSR Capability TLV in the MPLS echo reply message with the LSR
>    Capability TLV describing features and extensions supported by the
>    local LSR.  Otherwise, an MPLS echo reply MUST be sent back to the
>    initiator LSR with the return code set to "One or more of the TLVs
>    was not understood".  [...]
> 
> This last MUST seems like something that this document cannot mandate (as
> it attempts to apply to non-implementations of this document); it could only
> work if it was an existing requirement from the core MPLS spec, in which
> case lower-case "must" is appropriate (possibly with section reference).

This is addressed in version 07 by following proposal:

"Otherwise, an MPLS echo reply must be sent back to the initiator LSR with the return code set to "One or more of the TLVs was not understood", according to the rules as defined Section 3 of RFC8029."

Is this OK for you?

> 
>    o  If the responder LSR does not understand the "LAG Description
>       Indicator flag":
> 
>       *  Clear both the "Downstream LAG Info Accommodation flag" and the
>          "Upstream LAG Info Accommodation flag".
> 
> Similarly, if an LSR does not understand a flag, it cannot give special
> treatment to packets containing that flag.  (Why an LSR would not
> understand a flag defined in the same document that defines the Capabilities
> TLV is beyond me, but you seem to want to allow for that
> case.)

This has been fixed in version 07 by removing the above bullet.

> 
>    If the responder does not know the LSR Capability TLV, an MPLS echo
>    reply with the return code set to "One or more of the TLVs was not
>    understood" MUST be sent back to the initiator LSR.
> 
> This duplicates the content I quoted at the top of this section's comments.

The duplicate text has been removed in version 07.

> 
> Section 4.1
> 
>    Once the initiator LSR knows the capabilities that a responder
>    supports, then it sends an MPLS echo request carrying a DDMAP with
>    the "LAG Description Indicator flag" (G) set to the responder LSR.
> 
> nit: I think the key point is not that the initiator knows the capabilities of the
> peer, but rather that the initiator knows that the peer supports this specific
> mechanism.
> Also, is this "a DDMAP TLV"?

Yes, fixed in version 07, with the following update.

"Once the initiator LSR knows that a responder can support this mechanism, then it sends an MPLS echo request carrying a DDMAP TLV with  the "LAG Description Indicator flag" (G) set to the responder LSR."

Is this OK for you?


> Also, is this "a DDMAP TLV"?
> 
> Section 4.2
> 
> Reading this makes it sound like the "LAG Description Indicator flag" is
> completely orthogonal to regular DDMAP functionality, so that if I set the
> LAG description indicator flag but do not otherwise request detailed
> descriptions, only the LAG interfaces are returned.  However, the flag in
> question has to be inside a DDMAP TLV (and we should really say so, perhaps
> as "[flag] set *in the DDMAP TLV*"!), so it seems like in practice the LAG
> information can only be obtained in conjunction with full detailed description
> information.  (The specific suggestion here is to note clearly that the flag is se
> in the DDMAP TLV.)

Fixed in version 07 as you suggested.


> 
>       *  The responder LSR MUST include a DDMAP TLV when sending the
>          MPLS echo reply.
> 
> I got confused on my first pass through this section; adding "There is a single
> DDMAP TLV for the LAG interface, with member links described using sub-
> TLVs" here would have kept me from veering onto the wrong path.
> (I thought there was going to be one DDMAP TLV per member link, plus one
> for the LAG aggregate, with lots of duplication.)

Fixed in version 07 as you suggested.

> 
>    Based on the procedures described above, every LAG member link will
>    have a Local Interface Index Sub-TLV and a Multipath Data Sub-TLV
>    entries in the DDMAP TLV.  The order of the Sub-TLVs in the DDMAP TLV
>    for a LAG member link MUST be Local Interface Index Sub-TLV
>    immediately followed by Multipath Data Sub-TLV.  A LAG member link
>    MAY also have a corresponding Remote Interface Index Sub-TLV.  When a
>    Local Interface Index Sub-TLV, a Remote Interface Index-Sub-TLV and a
>    Multipath Data Sub-TLV are placed in the DDMAP TLV to describe a LAG
>    member link, they MUST be placed in the order of Local Interface
>    Index Sub-TLV, Remote Interface Index-Sub-TLV and Multipath Data Sub-
>    TLV.
> 
> This last MUST contradicts the previous MUST.  I suggest some more text to
> clarify under which conditions each requirement applies.

Clarification text added in version 07, and add "except as follows" at the end of second sentence in version 08.


> 
> I'd also suggest not using the figure for conveying a (apparent?) mandatory
> requirement, and instead adding some text at the end: "The block of local
> interface index, (optional remote interface index) and multipath data sub-
> TLVs for each member link MUST appear adjacent to each other in order of
> increasing local interface index."

Added the text as you suggested in version 07.

> 
> It's confusing to label the multiplath data sub-TLV as "MANDATORY" when
> the body text says "MUST add an [sic] Multipath data Sub-TLV [...] if the
> received DDMAP TLV requested multipath information", i.e., it is not always
> present, based on the contents of the echo request.

Since it's just an example here, will remove the MANDATORY and OPTIONAL.

Fixed in version 07.

> 
>    When none of the received multipath information maps to a particular
>    LAG member link, then the responder LSR MUST still place the Local
>    Interface Index Sub-TLV and the Multipath Data Sub-TLV for that LAG
>    member link in the DDMAP TLV, the value of Multipath Length field of
>    the Multipath Data Sub-TLV is set to zero.
> 
> nit: the last comma is a comma splice.

Fixed in version 07.

> 
> Section 5.1.2
> 
> It seems like this places a slightly stronger burden on the responder than
> stock 8029 does, in that there is a MUST-level requirement to act on the 'I'
> flag in combination with the 'G' flag, whereas for the 'I'
> flag alone 8029 only has a SHOULD-level requirement to respond accordingly.
> We may want to call this out as a change in behavior.

Changed to "SHOULD", then it can align with RFC8029.

Fixed in version 07.

> 
> Section 5.1.3
> 
>    Expectation is that there's a relationship between the interface
>    index of the outgoing LAG member link at TTL=n and the interface
>    index of the incoming LAG member link at TTL=n+1 for all discovered
>    entropies.  In other words, set of entropies that load balances to
> 
> nit: "The expectation"
> nit: "discovered entropies" is a strange wording given how the entropy label
> works; is this more like "all entropies examined"?

Fixed in version 07.

> 
> 
>    The initiator LSR sends two MPLS echo request messages to traverse
>    the two LAG member links at TTL=n+1:
> 
> How does the initiator know (which entropy label values?) to get the echo
> requests to traverse different member links?

The initiator cannot exactly know which entropy label values are workable. Presumably, the initiator can set different values according to the number of member links, then the responder can use entropy values for hash to traverse different member link. This is left to implementation.

> 
>    o  Error case:
> 
>       *  One MPLS echo request message reaches TTL=n+1 on an LAG member
>          link 1.
> 
>       *  The other MPLS echo request message also reaches TTL=n+1 on an
>          LAG member link 1.
> 
>       One or two MPLS echo request messages sent by the initiator LSR
>       cannot reach the immediate downstream LSR, or the two MPLS echo
>       request messages reach at the immediate downstream LSR from the
>       same LAG member link.
> 
> The description paragraph doesn't seem to match up the scenario described
> in the sub-bullets, which leaves me confused as to what
> scenario(s) are intended to be considered as the "error case".

The sub-bullets are not complete, will add a third sub-bullet as below:

"One or both MPLS echo request messages cannot reach the immediate downstream LSR on whichever link."

Fixed in version 07.

> 
> Section 6
> 
>    This document defines a new optional TLV which is referred to as the
>    "LSR Capability TLV.  [...]
> 
> Is this optional to use or comprehension-optional?  (We elsewhere rely on
> comprehension-mandatory behavior, so in the RFC 8029 terminology this
> seems to be a "mandatory TLV" even if it is not mandatory to use.)

See the reply to Alvaro.

Fixed in version 07.

> 
>    included in the MPLS echo reply message.  Otherwise, if the responder
>    does not know the LSR Capability TLV, an MPLS echo reply with the
>    return code set to "One or more of the TLVs was not understood" MUST
>    be sent back to the initiator LSR.
> 
> This duplicates a requirement stated elsewhere (that really ought to just be
> using existing required behavior from 8029 anyway).

Duplicate text removed in version 07.

> 
>    LSR Capability TLV Type is TBD1.  Length is 4.  The value field of
> 
> In a hypothetical future where we need to allocate a 33rd capability flag,
> would we rather create a new "extended capabilities" TLV (burning another
> 32 bits for type+length) or leave flexibility now for 'length'
> to increase in multiples of 4 with receivers ignoring bits past what they know
> about?

IMHO, 32 bit seems reasonable enough for me. I incline to keep the flag field as is.

> 
>       The "LSR Capability Flags" field is 4 octets in length, this
>       document defines the following flags:
> 
>       0                   1                   2                   3
>       0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
>      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
>      |                  Must Be Zero (Reserved)                  |U|D|
>      +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> 
> I'm wary of describing these as "Must Be Zero", since they're just unallocated
> but not required to be zero other than by the current state of bit allocations.
> (I guess this is conventional for MPLS, though?  So maybe it's not a problem.)
> Just "Reserved" would be fine.

Changed to reserved, fixed in version 07.

> 
>       This document defines two flags.  The remaining flags MUST be set
>       to zero when sending and ignored on receipt.  Both the U and the D
>       flag MUST be cleared in the MPLS echo request message when
>       sending, and ignored on receipt.  Neither, either or both the U
>       and the D flag MAY be set in the MPLS echo reply message.
> 
> Similarly, I think we want to say "unallocated flags" rather than "the
> remaining flags".

Fixed in version 07.

> 
> It also seems like the semantics being described here are "the TLV value is
> ignored in the echo request and only has meaning in the response"
> (i.e., this is a "tell me your capabilities" exchange, not a "here are mine; what
> are yours?" exchange).  As written, that applies only to the U and D bits, and
> is not a general property.  That may well be a fine choice if we think we might
> want the two-way exchange for some future allocated bit, but if we do think
> it's a general property of the mechanism, I'd suggest rewording the text.

It's designed as an unidirectional tool, make it a two-way exchange mechanism will require more considerations and substantial changes to draft.  So, I inline to keep it as-is. If we have such use case in the future, we update this mechanism then.

> 
> Section 7
> 
> (Per IANA, bits 4 and 5 are already assigned but are not reflected in the
> diagram.)

Fixed in version 07.

> 
> Section 10
> 
>    The Detailed Interface and Label Stack TLV format is derived from the
>    Interface and Label Stack TLV format (from [RFC8029]).  Two changes
>    are introduced.  The first is that the label stack is converted into
>    a sub-TLV.  The second is that a new sub-TLV is added to describe an
>    interface index.  These fields of Detailed Interface and Label Stack
>    TLV have the same use and meaning as in [RFC8029].  A summary of
>    these fields is as below:
> 
> nit: Are "These fields" just "The other fields"?

Fixed in version 07.

> 
>          The Address Type indicates if the interface is numbered or
>          unnumbered.  It also determines the length of the IP Address
>          and Interface fields.  The resulting total length of the
>          initial part of the TLV is listed as "K Octets".  The Address
>          Type is set to one of the following values:
> 
> nit: is it better to list the currently allocated values or just refer to the IANA
> registry?

Since here it also include the length field. I'd like to keep it both here and the IANA registry.

> 
> I seee that this "index assigned to the interface" language for unnumbered
> address types originates from RFC 8029, but both this document and 8029
> leave me confused as to the encoding used for this index (for either/both
> IPv4 or IPv6 address types).

It's normally an unsigned integer.

Fixed in version 07.

> 
> [My comment about "Must Be Zero" and allocatable bits also applies here]

Fixed in version 07..

> 
> Section 12
> 
> This document also adds a capability "negotiation" mechanism for MPLS echo
> request/reply exchanges.  As MPLS does not offer integrity protection for its
> messages, this negotiation is only as trustworthy as the network from which
> messages are accepted as valid, and initiators have no recourse but to accept
> faithfully the echo replies received.
> 
>    To prevent leakage of vital information to untrusted users, a
>    responder LSR MUST only accept MPLS echo request messages from
>    designated trusted sources via filtering source IP address field of
>    received MPLS echo request messages.  [...]
> 
> I'd prefer to see a note added here about how "source IP address filtering
> provides only a weak form of access control and is not, in general, a reliable
> security mechanism.  Nonetheless, it is required here in the absence of any
> more robust mechanism that might be used."

Added in version 07.

> 
> Section 13
> 
> Should we ask IANA to update the "Interface and Label Stack Address Types"
> registry to also refer to this document, since we are sharing the registry
> between the Interface and Label Stack TLV and the Detailed version?
> 

Added in version 07.


> Section 13.2
> 
> The range 4-31743 spans two different allocation procedures (Standards
> Action and Specification Required - Experimental RFC needed); I assume that
> we want the 4-16383 range for Standards Action - mandatory TLVs.

Yes.

> 
> Section 13.4
> 
> We don't say what range we want this to be allocated from -- I assume
> 4-16383 since this is a negotiated feature and sending it outside the
> negotiated scenario should be an error.

Suggested range added in version 07.

> 
> Section 13.4.1
> 
> (I note that some existing sub-TLV registries have "Experimental RFC
> required" for the experimental ranges, but that is arguably more stringent
> than necessary; Specification Required seems more appropriate for this case.)

Changed to " Specification Required" in version 07.

> 
> Appendix A
> 
> I was kind of hoping for some more guidance on what an initiator could do in
> these scenarios to try to get better visibility into the switch behavior (and,
> hopefully, a workaround to get reliable echo responses that cover all LAG
> member links).  AFAICT there's not a better option than "try a bunch of
> entropy labels and see what responses you can get back" and that's the
> same remedy in all the described topologies, but it would be nice to have this
> stated explicitly for the reader.

A note added in version 07.

> 
> Section A.1
> 
>                     In the worst case, MPLS echo request messages with
>    specific entropies to exercise every LAG members from R1 to S1 can
>    all reach R2 via same LAG member.  [...]
> 
> I'm confused by the phrase "specific entropies to exercise every LAG
> members [sic] from R1 to S1" -- is there some deterministic algorithm by
> which a specific entropy label value will force a specific LAG member link?

No.

> My understanding was that the entropy was used as input to an opaque hash
> function and that trial-and-error was the only way to explore the mapping
> from entropy value to LAG member link taken.

Yes.

Thanks,
Mach
>