Re: [Idr] Benjamin Kaduk's Discuss on draft-ietf-idr-tunnel-encaps-20: (with DISCUSS and COMMENT)

Benjamin Kaduk <kaduk@mit.edu> Tue, 01 December 2020 19:51 UTC

Return-Path: <kaduk@mit.edu>
X-Original-To: idr@ietfa.amsl.com
Delivered-To: idr@ietfa.amsl.com
Received: from localhost (localhost [127.0.0.1]) by ietfa.amsl.com (Postfix) with ESMTP id 39AD73A149B; Tue, 1 Dec 2020 11:51:21 -0800 (PST)
X-Virus-Scanned: amavisd-new at amsl.com
X-Spam-Flag: NO
X-Spam-Score: -1.919
X-Spam-Level:
X-Spam-Status: No, score=-1.919 tagged_above=-999 required=5 tests=[BAYES_00=-1.9, RCVD_IN_MSPIKE_H4=-0.01, RCVD_IN_MSPIKE_WL=-0.01, SPF_HELO_NONE=0.001, SPF_PASS=-0.001, URIBL_BLOCKED=0.001] autolearn=ham autolearn_force=no
Received: from mail.ietf.org ([4.31.198.44]) by localhost (ietfa.amsl.com [127.0.0.1]) (amavisd-new, port 10024) with ESMTP id qnI39yphmw0C; Tue, 1 Dec 2020 11:51:19 -0800 (PST)
Received: from outgoing.mit.edu (outgoing-auth-1.mit.edu [18.9.28.11]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ietfa.amsl.com (Postfix) with ESMTPS id 96B6C3A1072; Tue, 1 Dec 2020 11:51:18 -0800 (PST)
Received: from kduck.mit.edu ([24.16.140.251]) (authenticated bits=56) (User authenticated as kaduk@ATHENA.MIT.EDU) by outgoing.mit.edu (8.14.7/8.12.4) with ESMTP id 0B1JpBit027915 (version=TLSv1/SSLv3 cipher=DHE-RSA-AES256-GCM-SHA384 bits=256 verify=NOT); Tue, 1 Dec 2020 14:51:15 -0500
Date: Tue, 1 Dec 2020 11:51:10 -0800
From: Benjamin Kaduk <kaduk@mit.edu>
To: Susan Hares <shares@ndzh.com>
Cc: "'The IESG'" <iesg@ietf.org>, idr@ietf.org, idr-chairs@ietf.org, draft-ietf-idr-tunnel-encaps@ietf.org
Message-ID: <20201201195110.GS34187@kduck.mit.edu>
References: <160684669488.21585.5180075052177708759@ietfa.amsl.com> <016401d6c81a$61b49da0$251dd8e0$@ndzh.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Disposition: inline
Content-Transfer-Encoding: quoted-printable
In-Reply-To: <016401d6c81a$61b49da0$251dd8e0$@ndzh.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/03ZRiDNVjFF0OvEaj55cmRTlf9Y>
Subject: Re: [Idr] Benjamin Kaduk's Discuss on draft-ietf-idr-tunnel-encaps-20: (with DISCUSS and COMMENT)
X-BeenThere: idr@ietf.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: Inter-Domain Routing <idr.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/idr>, <mailto:idr-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/idr/>
List-Post: <mailto:idr@ietf.org>
List-Help: <mailto:idr-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/idr>, <mailto:idr-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 01 Dec 2020 19:51:21 -0000

Hi Sue,

Thanks for asking.

I don't really have a concrete definition for "adequate", and am hoping
that the rest of the IESG can help determine that.
My primary goal here is to call the IESG's attention to the earlier
comments we had received on the other document and the question of whether
those comments apply to the matter at hand.  I expect that we will talk
about it on the telechat on thursday and that that will result in either me
clearing this discuss point or a set of concrete actionable criteria that
need to be met.

Does that help?

Thanks,

Ben

On Tue, Dec 01, 2020 at 02:44:26PM -0500, Susan Hares wrote:
> Benjamin: 
> 
> Since the IESG has engaged in a previous discussion regarding two documents, would let me know what you think is an "adequate" discussion.  Are you looking for a joint discussion with the two working groups (IDR and BESS) on the mail list plus a written report?  Or something else? 
> 
> Thank you, Sue  
> 
> -----Original Message-----
> From: Idr [mailto:idr-bounces@ietf.org] On Behalf Of Benjamin Kaduk via Datatracker
> Sent: Tuesday, December 1, 2020 1:18 PM
> To: The IESG
> Cc: idr@ietf.org; idr-chairs@ietf.org; draft-ietf-idr-tunnel-encaps@ietf.org; shares@ndzh.com
> Subject: [Idr] Benjamin Kaduk's Discuss on draft-ietf-idr-tunnel-encaps-20: (with DISCUSS and COMMENT)
> 
> Benjamin Kaduk has entered the following ballot position for
> draft-ietf-idr-tunnel-encaps-20: Discuss
> 
> When responding, please keep the subject line intact and reply to all email addresses included in the To and CC lines. (Feel free to cut this introductory paragraph, however.)
> 
> 
> Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
> for more information about IESG DISCUSS and COMMENT positions.
> 
> 
> The document, along with other ballot positions, can be found here:
> https://datatracker.ietf.org/doc/draft-ietf-idr-tunnel-encaps/
> 
> 
> 
> ----------------------------------------------------------------------
> DISCUSS:
> ----------------------------------------------------------------------
> 
> I support Erik's discuss.
> 
> I see that Roman has already suggested adding normative language regarding the limitation to a single administrative domain (in addition to the "MUST filter by default for EBGP sessions"), which I agree with.
> However, I think there is an additional consideration regarding the limitation of use to a single administrative domain, wherein the domain of use for the tunnel encapsulation attribute may diverge from the domain of use of segment routing, that seems to place this document in conflict with the requirements of RFC 8402.  In particular, RFC 8402 says, for SR-MPLS and SRv6, that boundary routers "MUST filter any external traffic", and additionally for SRv6 that "explicit routing information MUST NOT be leaked through the boundaries of the administrered domain".  In §3.7 of this document, though, we find that for the Prefix-SID sub-TLV, "the receiving BGP speaker need not even be in the same Segment Routing Domain as the tunnel's egress endpoint, and there is no implication that the prefix-SID for the advertised prefix is the same in the Segment Routing domains of the BGP speaker that originated the sub-TLV and the BGP speaker that received it", which seems to suggest violation of the RFC 8402 requirement.  I think we need to have greater clarity on what relationship is actually intended between the SR domain and the tunnel encapsulation usage domain, and if they are to diverge, we need to both somehow rectify this behavior with RFC 8402 and to very clearly document how the 8402-mandated filtering at the SR domain boundary is supposed to happen when the boundary includes tunneled traffic.
> 
> I also would like to ensure that we have had adequate discussion of the relationship between this document and RFC 8365.  The IESG has received comments recently (in the context of a different document) that it is irresponsible to effectively obsolete or deprecate existing work without identifying or explicitly updating such work, and without indicating whose responsibility it is to find discrepancies.  That seems like it might apply to what's currently in Appendix A, which on first reading suggests "there might be a problem here, but we aren't saying exactly what or how to fix it, or even who is going to do that work".
> 
> 
> ----------------------------------------------------------------------
> COMMENT:
> ----------------------------------------------------------------------
> 
> It's good to see that the shepherd writeup got updated as things changed; thank you for keeping it up to date!
> 
> [I initially wrote some inline comments about handling internal inconsistencies within a given tunnel specification as malformed and ignoring the tunnel entirely vs specifying a precedence order (the latter being what this document does).  I removed them, because this seems to be a generic topic where security types tend to fail-closed and routing types tend to aim to provide some kind of service when possible, and I don't have anything new to add to the discussion for these particular cases.]
> 
> I didn't see a response to the secdir review; it would be good to get a response to that, in particular to hear what amount of consideration has been given to what new ways this provides to attack BGP.
> 
> Abstract
> 
>    of certain other SAFIs.  This document adds support for additional
>    Tunnel Types, and allows a remote tunnel endpoint address to be
>    specified for each tunnel.  This document also provides support for
> 
> The shepherd writeup suggests that the "remote tunnel endpoint"
> terminology was switched to be "tunnel egress endpoint"; was this spot missed?
> 
> Section 1.4
> 
>    o  Defining a new "Tunnel Egress Endpoint sub-TLV" (Section 3.1) that
>       can be included in any of the TLVs contained in the Tunnel
>       Encapsulation attribute.  This sub-TLV can be used to specify the
>       remote endpoint address of a particular tunnel.
> 
> ["remote endpoint" again]
> 
> Section 3.1
> 
> I agree with Martin V that there must be a story about this Reserved field and why it's only SHOULD send as zero.  I don't know whether this information needs to end up in the RFC but I think we should talk about why it is this way.  In particular, the current requirements suggest that it could be (mis?)used as an additional data channel by collaborating implementations (that ignore the "MUST be disregarded"), without actually writing up a specification for those semantics.
> 
>    If the Address Family subfield has any value other than IPv4 or IPv6,
>    the Tunnel Egress Endpoint sub-TLV is considered "unrecognized" (see
> 
> We probably need to repeat the carve-out for the value 0 here, as well.
> (I dithered about remarking about the earlier "assumes that the Address Family is either IPv4 or IPv6" since the "one special case" is a few paragraphs later.)
> 
>    o  It can be determined that the IP address in the sub-TLV's address
>       subfield does not belong to the Autonomous System (AS) that
>       originated the route that contains the attribute.  Section 3.1.1
>       describes an optional procedure to make this determination.
> 
> This check seems highly important for the security of the system and should get called out in the security considerations.
> 
> Section 3.1.1
> 
>    sub-TLV is considered not to be valid.  In some cases a network
>    operator who controls a set of Autonomous Systems might wish to allow
>    a Tunnel Egress Endpoint to reside in an AS other than Route_AS;
>    configuration MAY allow for such a case, in which case the check
>    becomes, if Egress_AS is not within the configured set of permitted
>    AS numbers, then the Tunnel Egress Endpoint sub-TLV is considered to
>    be "malformed".
> 
> (nit?) maybe "the configured set of permitted AS numbers that contains Route_AS"?  The current wording implies that there can only be one such configured set and that it is used regardless of Route_AS, which does not seem right...
> 
> Section 3.2
> 
>    This section defines Encapsulation sub-TLVs for the following tunnel
>    types: VXLAN ([RFC7348]), NVGRE ([RFC7637]), MPLS-in-GRE ([RFC4023]),
>    L2TPv3 ([RFC3931]), and GRE ([RFC2784]).
> 
> Thanks for putting the links all in one place, here.  I, at least, would have benefited from also putting the links/references in the corresponding sections, but that is probably just a matter of style.
> 
> Section 3.2.1
> 
>       R: The remaining bits in the 8-bit flags field are reserved for
>       further use.  They MUST always be set to 0 by the originator of
>       the sub-TLV.  Intermediate routers MUST propagate them without
>       modification.  Any receiving routers MUST ignore these bits upon a
>       receipt of the sub-TLV.
> 
> nit: spurious "a" in "upon a receipt" (and diffing this section against
> §3.2.2 it seems that maybe the "of the sub-TLV" is also superfluous?).
> 
>    o  If the V bit is not set, and the BGP UPDATE message has AFI/SAFI
>       other than Ethernet VPNs (EVPN) then the VXLAN tunnel cannot be
>       used.
> 
> If this is intended to refer to SAFI 70 (from RFC 7432), I note that the IANA entry is named "BGP EVPNs".
> 
> [I also don't understand why it's okay for EVPN to not have a VN-ID when using the VXLAN tunnel, but assume that's just my ignorance.]
> 
> Section 3.2.2
> 
>       Reserved (two fields): MUST be set to zero on transmission and
>       disregarded on receipt.
> 
> (nit) I only see one field marked "Reserved" (this format is the same layout as for VXLAN).
> 
>    o  The values of the V, M, and R bits are NOT copied into the flags
>       field of the NVGRE header.  The flags field of the VXLAN header is
>       set as per [RFC7637].
> 
> (nit) stray "VXLAN"?
> 
> Section 3.2.5
> 
>    While it is not really necessary to have both the GRE and MPLS-in-GRE
>    tunnel types, both are included for reasons of backwards
>    compatibility.
> 
> It might be nice to have a few more words about which one is the "backward compatible" option and what it's compatible with.
> 
> Section 3.3
> 
>    If an outer Encapsulation sub-TLV occurs in a TLV for a Tunnel Type
>    that does not use the corresponding outer encapsulation, the sub-TLV
>    MUST be treated as if it were an unknown type of sub-TLV.
> 
> nit: this is the only instance of "unknown" in the document; using "unrecognized" seems to be the common case (and makes it easier to find Section 13).
> 
> Section 3.3.1
> 
> I think we may need to discuss the semantics of the DS field here -- as I understand it, the attribute advertised in this TLV is the DSCP value that the BGP speaker would like to receive in traffic destined to the tunnel egress endpoint (which may be a different node than the BGP speaker itself, but is expected to be under the control of the same administrative entity).  Additionally, the interpretation of DSCP values is subject to local interpretation on a given network.  Since the tunnel encapsulation attribute is transitive, it will be propagated potentially across multiple BGP hops and multiple ASes, so that the tunnel ingress endpoint is not necessarily adjacent to the tunnel egress endpoint.
> Although we do say that we expect the tunnel encapsulation information to only be propagated within an administrative boundary, there is no guarantee that the administrative boundary in question uses a unified DSCP handling procedure throughout it.  As such, it may be possible to end up in a regime where the requested DSCP codepoint has a different, and potentially hazardous, interpretation, at the ingress of the tunnel.
> So, it seems that we need to say something about either local policy for DSCP value filtering, or only using this value when "directly" connected to the egress AS, or similar; we do have something like this already for the TC portion of the MPLS label stack entries.
> 
> Section 3.5
> 
>    labeled address family, then the sub-TLV MUST be disregarded.  If the
>    sub-TLV is contained in a TLV whose Tunnel Type does not have a
>    virtual network identifier in its encapsulation header, the sub-TLV
>    MUST be disregarded.  In those cases where the sub-TLV is ignored, it
>    SHOULD NOT be stripped from the TLV before the route is propagated.
> 
> Why only SHOULD NOT here?  I thought we hat MUST-level requirements to preserve things unchaged in similar situations.
> 
> Section 4.1
> 
>    In the remainder of this specification, when a route is referred to
>    as containing a Tunnel Encapsulation attribute with a TLV identifying
>    a particular Tunnel Type, it implicitly includes the case where the
>    route contains a Tunnel Encapsulation Extended Community identifying
>    that Tunnel Type.
> 
> I searched the entire document for the string "identifying" and did not find any instances where a route was referred to as containing a Tunenl Encapsulation attribute with a TLV identifying a particular tunnel type.
> Perhaps I should be looking for the "attribute" keyword, but there are over 200 instances of that string; could you confirm whether this implicit inclusion is actually used anywhere (and if so, give an example of such usage)?
> 
> Section 6
> 
>    [RFC5512] specifies the use of the Tunnel Encapsulation attribute in
>    BGP UPDATE messages of AFI/SAFI 1/7 and 2/7.  That document restricts
>    the use of this attribute to UPDATE messages of those SAFIs.  This
>    document removes that restriction.
> 
> I believe another reviewer commented on the ambiguity of "that", which I first thought referred to "this" vs "that"; I now see that there is additional ambituity as to whether it is the SAFI restriction or the UPDATE message restriction that is lifted, and suggest clarification of that as well.
> 
>    Once it is determined to send a packet through the tunnel specified
>    in a particular Tunnel TLV of a particular Tunnel Encapsulation
>    attribute, then the tunnel's egress endpoint address is the IP
>    address contained in the sub-TLV.  If the Tunnel TLV contains a
> 
> nit: I think we have to say "Tunnel Egress Endpoint sub-TLV"; the use of the definite article is not justified by the preceding context.
> 
> Section 8
> 
>    However, suppose that one of the TLVs in U2's Tunnel Encapsulation
>    attribute contains the Color Sub-TLV.  In that case, packet P MUST
>    NOT be sent through the tunnel contained in that TLV, unless U1 is
>    carrying the Color Extended Community that is identified in U2's
>    Color Sub-TLV.
> 
> We should probably reword this in light of Section 13's discussion that a given Tunnel TLV can have more than one Color sub-TLV present.
> 
> Section 9.2
> 
>    Three of the tunnel types that can be specified in a Tunnel
>    Encapsulation TLV have virtual network identifier fields in their
>    encapsulation headers.  In the VXLAN encapsulation, this field is
>    called the VNI (Virtual Network Identifier) field; in the NVGRE
>    encapsulation, this field is called the VSID (Virtual Subnet
>    Identifier) field.
> 
> We start off by saying "three" types but list only two.  What's the third type?
> 
> Section 9.2.2.2
> 
>    If the TLV identifying the tunnel does not contain an Encapsulation
>    sub-TLV whose V bit is set, the virtual network identifier field of
>    the encapsulation header is set as follows:
> 
> This perhaps sets us up for a nasty surprise in light of the error-handling behavior in §13, where we are supposed to disregard the second and subsequent instances of the Encapsulation sub-TLV.  This language is not particularly clear about whether it applies to only the first sub-TLV or all instances.
> 
>    o  If the TLV does not contain an Embedded Label Handling sub-TLV, or
>       if it contains an Embedded Label Handling sub-TLV whose value is
>       2, the embedded label is copied into the lower 3 octets of the
>       virtual network identifier field of the encapsulation header.
> 
> nit: I think "lower" is unneeded, since all the VNI fields are exactly 3 octets now.
> 
> Section 10
> 
>    Note that if the Tunnel Encapsulation attribute is attached to a VPN-
>    IP route [RFC4364], and if Inter-AS "option b" (see section 10 of
>    [RFC4364]) is being used, and if the Tunnel Egress Endpoint sub-TLV
>    contains an IP address that is not in same AS as the router receiving
>    the route, it is very likely that the embedded label has been
>    changed.  [...]
> 
> I'm not sure that I'm understanding this scenario properly.  The label has been "changed" with respect to what baseline?  I'm also not sure why the tunnel egress endpoint would need to be in the same AS as the router receiving (vs originating) the route.
> 
>    Other documents may define other ways to signal tunnel information in
>    BGP.  For example, [RFC6514] defines the "P-Multicast Service
>    Interface Tunnel" (PMSI Tunnel) attribute.  In this specification, we
>    do not consider the effects of advertising the Tunnel Encapsulation
>    Attribute in conjunction with other forms of signaling tunnels.  Any
>    document specifying such joint use should provide details as to how
>    interactions should be handled.
> 
> It seems like we should perhaps go a step further and explicitly recommend not advertising such combinations in the absence of a specification for their combined use?  Otherwise implementations will have to come up with their own interpretations, which could easily be uninteroperable.
> 
> Section 13
> 
>    The following sub-TLVs defined in this document MUST NOT occur more
>    than once in a given Tunnel TLV: Tunnel Egress Endpoint (discussed
>    below), Encapsulation, DS, UDP Destination Port, Embedded Label
>    Handling, MPLS Label Stack, Prefix-SID.  [...]
> 
> Ah, thanks for listing these out.  I had been wondering about this situation earlier in the doc, and it would have helped if the "not more than once" limitation was mentioned at each sub-TLV's definition (even if the actual error handling stays here).
> 
> Section 14.3
> 
> Why do we only move "BGP Tunnel Encapsulation Attribute Sub-TLVs" (but not "BGP Tunnel Encapsulation Attribute Tunnel Types") to the new "BGP Tunnel Encapsulation Parameters" grouping?
> 
> Section 14.9
> 
> It might be useful to indicate in the registry metadata how many flags are available (and, I suppose, in which order the bits are numbered).
> 
> Section 15
> 
> We briefly discuss in the main body text the possibility that a tunnel will direct encapsulated traffic with (e.g) MPLS labels to a node that will misinterpret those labels; it might be worth reiterating the risk of such misinterpretation in the security considerations as well (or just referencing the previous discussion as security-relevant).
> 
> I guess there's also a theoretical possibility that the flexibility in tunnel specification (including the type of expected content) could facilitate cross-protocol attacks, where the attacker causes the sender and recipient of encapsulated traffic to think that they should interpret the records in question as different protocols.  But this seems so remote, and unlikely to succeed given different protocols'
> message structure, that it is probably not worth mentioning.
> 
> Section 18.2
> 
> If we're using RFC 5462 as a reference for a field in an MPLS label, that seems to make it normative.
> 
> We seem to depend on procedures from RFC 6811 in a few places, which also seems to make it normative.
> 
> 
> 
> _______________________________________________
> Idr mailing list
> Idr@ietf.org
> https://www.ietf.org/mailman/listinfo/idr
>