[Idr] Benjamin Kaduk's Discuss on draft-ietf-idr-tunnel-encaps-20: (with DISCUSS and COMMENT)
Benjamin Kaduk via Datatracker <noreply@ietf.org> Tue, 01 December 2020 18:18 UTC
Return-Path: <noreply@ietf.org>
X-Original-To: idr@ietf.org
Delivered-To: idr@ietfa.amsl.com
Received: from ietfa.amsl.com (localhost [IPv6:::1]) by ietfa.amsl.com (Postfix) with ESMTP id DF09C3A141E; Tue, 1 Dec 2020 10:18:14 -0800 (PST)
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 8bit
From: Benjamin Kaduk via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-idr-tunnel-encaps@ietf.org, idr-chairs@ietf.org, idr@ietf.org, John Scudder <jgs@juniper.net>, aretana.ietf@gmail.com, Susan Hares <shares@ndzh.com>, shares@ndzh.com
X-Test-IDTracker: no
X-IETF-IDTracker: 7.23.0
Auto-Submitted: auto-generated
Precedence: bulk
Reply-To: Benjamin Kaduk <kaduk@mit.edu>
Message-ID: <160684669488.21585.5180075052177708759@ietfa.amsl.com>
Date: Tue, 01 Dec 2020 10:18:14 -0800
Archived-At: <https://mailarchive.ietf.org/arch/msg/idr/yR0KfBKtw63gR5BPzOeApnFXfi8>
Subject: [Idr] Benjamin Kaduk's Discuss on draft-ietf-idr-tunnel-encaps-20: (with DISCUSS and COMMENT)
X-BeenThere: idr@ietf.org
X-Mailman-Version: 2.1.29
List-Id: Inter-Domain Routing <idr.ietf.org>
List-Unsubscribe: <https://www.ietf.org/mailman/options/idr>, <mailto:idr-request@ietf.org?subject=unsubscribe>
List-Archive: <https://mailarchive.ietf.org/arch/browse/idr/>
List-Post: <mailto:idr@ietf.org>
List-Help: <mailto:idr-request@ietf.org?subject=help>
List-Subscribe: <https://www.ietf.org/mailman/listinfo/idr>, <mailto:idr-request@ietf.org?subject=subscribe>
X-List-Received-Date: Tue, 01 Dec 2020 18:18:15 -0000
Benjamin Kaduk has entered the following ballot position for draft-ietf-idr-tunnel-encaps-20: Discuss When responding, please keep the subject line intact and reply to all email addresses included in the To and CC lines. (Feel free to cut this introductory paragraph, however.) Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html for more information about IESG DISCUSS and COMMENT positions. The document, along with other ballot positions, can be found here: https://datatracker.ietf.org/doc/draft-ietf-idr-tunnel-encaps/ ---------------------------------------------------------------------- DISCUSS: ---------------------------------------------------------------------- I support Erik's discuss. I see that Roman has already suggested adding normative language regarding the limitation to a single administrative domain (in addition to the "MUST filter by default for EBGP sessions"), which I agree with. However, I think there is an additional consideration regarding the limitation of use to a single administrative domain, wherein the domain of use for the tunnel encapsulation attribute may diverge from the domain of use of segment routing, that seems to place this document in conflict with the requirements of RFC 8402. In particular, RFC 8402 says, for SR-MPLS and SRv6, that boundary routers "MUST filter any external traffic", and additionally for SRv6 that "explicit routing information MUST NOT be leaked through the boundaries of the administrered domain". In §3.7 of this document, though, we find that for the Prefix-SID sub-TLV, "the receiving BGP speaker need not even be in the same Segment Routing Domain as the tunnel's egress endpoint, and there is no implication that the prefix-SID for the advertised prefix is the same in the Segment Routing domains of the BGP speaker that originated the sub-TLV and the BGP speaker that received it", which seems to suggest violation of the RFC 8402 requirement. I think we need to have greater clarity on what relationship is actually intended between the SR domain and the tunnel encapsulation usage domain, and if they are to diverge, we need to both somehow rectify this behavior with RFC 8402 and to very clearly document how the 8402-mandated filtering at the SR domain boundary is supposed to happen when the boundary includes tunneled traffic. I also would like to ensure that we have had adequate discussion of the relationship between this document and RFC 8365. The IESG has received comments recently (in the context of a different document) that it is irresponsible to effectively obsolete or deprecate existing work without identifying or explicitly updating such work, and without indicating whose responsibility it is to find discrepancies. That seems like it might apply to what's currently in Appendix A, which on first reading suggests "there might be a problem here, but we aren't saying exactly what or how to fix it, or even who is going to do that work". ---------------------------------------------------------------------- COMMENT: ---------------------------------------------------------------------- It's good to see that the shepherd writeup got updated as things changed; thank you for keeping it up to date! [I initially wrote some inline comments about handling internal inconsistencies within a given tunnel specification as malformed and ignoring the tunnel entirely vs specifying a precedence order (the latter being what this document does). I removed them, because this seems to be a generic topic where security types tend to fail-closed and routing types tend to aim to provide some kind of service when possible, and I don't have anything new to add to the discussion for these particular cases.] I didn't see a response to the secdir review; it would be good to get a response to that, in particular to hear what amount of consideration has been given to what new ways this provides to attack BGP. Abstract of certain other SAFIs. This document adds support for additional Tunnel Types, and allows a remote tunnel endpoint address to be specified for each tunnel. This document also provides support for The shepherd writeup suggests that the "remote tunnel endpoint" terminology was switched to be "tunnel egress endpoint"; was this spot missed? Section 1.4 o Defining a new "Tunnel Egress Endpoint sub-TLV" (Section 3.1) that can be included in any of the TLVs contained in the Tunnel Encapsulation attribute. This sub-TLV can be used to specify the remote endpoint address of a particular tunnel. ["remote endpoint" again] Section 3.1 I agree with Martin V that there must be a story about this Reserved field and why it's only SHOULD send as zero. I don't know whether this information needs to end up in the RFC but I think we should talk about why it is this way. In particular, the current requirements suggest that it could be (mis?)used as an additional data channel by collaborating implementations (that ignore the "MUST be disregarded"), without actually writing up a specification for those semantics. If the Address Family subfield has any value other than IPv4 or IPv6, the Tunnel Egress Endpoint sub-TLV is considered "unrecognized" (see We probably need to repeat the carve-out for the value 0 here, as well. (I dithered about remarking about the earlier "assumes that the Address Family is either IPv4 or IPv6" since the "one special case" is a few paragraphs later.) o It can be determined that the IP address in the sub-TLV's address subfield does not belong to the Autonomous System (AS) that originated the route that contains the attribute. Section 3.1.1 describes an optional procedure to make this determination. This check seems highly important for the security of the system and should get called out in the security considerations. Section 3.1.1 sub-TLV is considered not to be valid. In some cases a network operator who controls a set of Autonomous Systems might wish to allow a Tunnel Egress Endpoint to reside in an AS other than Route_AS; configuration MAY allow for such a case, in which case the check becomes, if Egress_AS is not within the configured set of permitted AS numbers, then the Tunnel Egress Endpoint sub-TLV is considered to be "malformed". (nit?) maybe "the configured set of permitted AS numbers that contains Route_AS"? The current wording implies that there can only be one such configured set and that it is used regardless of Route_AS, which does not seem right... Section 3.2 This section defines Encapsulation sub-TLVs for the following tunnel types: VXLAN ([RFC7348]), NVGRE ([RFC7637]), MPLS-in-GRE ([RFC4023]), L2TPv3 ([RFC3931]), and GRE ([RFC2784]). Thanks for putting the links all in one place, here. I, at least, would have benefited from also putting the links/references in the corresponding sections, but that is probably just a matter of style. Section 3.2.1 R: The remaining bits in the 8-bit flags field are reserved for further use. They MUST always be set to 0 by the originator of the sub-TLV. Intermediate routers MUST propagate them without modification. Any receiving routers MUST ignore these bits upon a receipt of the sub-TLV. nit: spurious "a" in "upon a receipt" (and diffing this section against §3.2.2 it seems that maybe the "of the sub-TLV" is also superfluous?). o If the V bit is not set, and the BGP UPDATE message has AFI/SAFI other than Ethernet VPNs (EVPN) then the VXLAN tunnel cannot be used. If this is intended to refer to SAFI 70 (from RFC 7432), I note that the IANA entry is named "BGP EVPNs". [I also don't understand why it's okay for EVPN to not have a VN-ID when using the VXLAN tunnel, but assume that's just my ignorance.] Section 3.2.2 Reserved (two fields): MUST be set to zero on transmission and disregarded on receipt. (nit) I only see one field marked "Reserved" (this format is the same layout as for VXLAN). o The values of the V, M, and R bits are NOT copied into the flags field of the NVGRE header. The flags field of the VXLAN header is set as per [RFC7637]. (nit) stray "VXLAN"? Section 3.2.5 While it is not really necessary to have both the GRE and MPLS-in-GRE tunnel types, both are included for reasons of backwards compatibility. It might be nice to have a few more words about which one is the "backward compatible" option and what it's compatible with. Section 3.3 If an outer Encapsulation sub-TLV occurs in a TLV for a Tunnel Type that does not use the corresponding outer encapsulation, the sub-TLV MUST be treated as if it were an unknown type of sub-TLV. nit: this is the only instance of "unknown" in the document; using "unrecognized" seems to be the common case (and makes it easier to find Section 13). Section 3.3.1 I think we may need to discuss the semantics of the DS field here -- as I understand it, the attribute advertised in this TLV is the DSCP value that the BGP speaker would like to receive in traffic destined to the tunnel egress endpoint (which may be a different node than the BGP speaker itself, but is expected to be under the control of the same administrative entity). Additionally, the interpretation of DSCP values is subject to local interpretation on a given network. Since the tunnel encapsulation attribute is transitive, it will be propagated potentially across multiple BGP hops and multiple ASes, so that the tunnel ingress endpoint is not necessarily adjacent to the tunnel egress endpoint. Although we do say that we expect the tunnel encapsulation information to only be propagated within an administrative boundary, there is no guarantee that the administrative boundary in question uses a unified DSCP handling procedure throughout it. As such, it may be possible to end up in a regime where the requested DSCP codepoint has a different, and potentially hazardous, interpretation, at the ingress of the tunnel. So, it seems that we need to say something about either local policy for DSCP value filtering, or only using this value when "directly" connected to the egress AS, or similar; we do have something like this already for the TC portion of the MPLS label stack entries. Section 3.5 labeled address family, then the sub-TLV MUST be disregarded. If the sub-TLV is contained in a TLV whose Tunnel Type does not have a virtual network identifier in its encapsulation header, the sub-TLV MUST be disregarded. In those cases where the sub-TLV is ignored, it SHOULD NOT be stripped from the TLV before the route is propagated. Why only SHOULD NOT here? I thought we hat MUST-level requirements to preserve things unchaged in similar situations. Section 4.1 In the remainder of this specification, when a route is referred to as containing a Tunnel Encapsulation attribute with a TLV identifying a particular Tunnel Type, it implicitly includes the case where the route contains a Tunnel Encapsulation Extended Community identifying that Tunnel Type. I searched the entire document for the string "identifying" and did not find any instances where a route was referred to as containing a Tunenl Encapsulation attribute with a TLV identifying a particular tunnel type. Perhaps I should be looking for the "attribute" keyword, but there are over 200 instances of that string; could you confirm whether this implicit inclusion is actually used anywhere (and if so, give an example of such usage)? Section 6 [RFC5512] specifies the use of the Tunnel Encapsulation attribute in BGP UPDATE messages of AFI/SAFI 1/7 and 2/7. That document restricts the use of this attribute to UPDATE messages of those SAFIs. This document removes that restriction. I believe another reviewer commented on the ambiguity of "that", which I first thought referred to "this" vs "that"; I now see that there is additional ambituity as to whether it is the SAFI restriction or the UPDATE message restriction that is lifted, and suggest clarification of that as well. Once it is determined to send a packet through the tunnel specified in a particular Tunnel TLV of a particular Tunnel Encapsulation attribute, then the tunnel's egress endpoint address is the IP address contained in the sub-TLV. If the Tunnel TLV contains a nit: I think we have to say "Tunnel Egress Endpoint sub-TLV"; the use of the definite article is not justified by the preceding context. Section 8 However, suppose that one of the TLVs in U2's Tunnel Encapsulation attribute contains the Color Sub-TLV. In that case, packet P MUST NOT be sent through the tunnel contained in that TLV, unless U1 is carrying the Color Extended Community that is identified in U2's Color Sub-TLV. We should probably reword this in light of Section 13's discussion that a given Tunnel TLV can have more than one Color sub-TLV present. Section 9.2 Three of the tunnel types that can be specified in a Tunnel Encapsulation TLV have virtual network identifier fields in their encapsulation headers. In the VXLAN encapsulation, this field is called the VNI (Virtual Network Identifier) field; in the NVGRE encapsulation, this field is called the VSID (Virtual Subnet Identifier) field. We start off by saying "three" types but list only two. What's the third type? Section 9.2.2.2 If the TLV identifying the tunnel does not contain an Encapsulation sub-TLV whose V bit is set, the virtual network identifier field of the encapsulation header is set as follows: This perhaps sets us up for a nasty surprise in light of the error-handling behavior in §13, where we are supposed to disregard the second and subsequent instances of the Encapsulation sub-TLV. This language is not particularly clear about whether it applies to only the first sub-TLV or all instances. o If the TLV does not contain an Embedded Label Handling sub-TLV, or if it contains an Embedded Label Handling sub-TLV whose value is 2, the embedded label is copied into the lower 3 octets of the virtual network identifier field of the encapsulation header. nit: I think "lower" is unneeded, since all the VNI fields are exactly 3 octets now. Section 10 Note that if the Tunnel Encapsulation attribute is attached to a VPN- IP route [RFC4364], and if Inter-AS "option b" (see section 10 of [RFC4364]) is being used, and if the Tunnel Egress Endpoint sub-TLV contains an IP address that is not in same AS as the router receiving the route, it is very likely that the embedded label has been changed. [...] I'm not sure that I'm understanding this scenario properly. The label has been "changed" with respect to what baseline? I'm also not sure why the tunnel egress endpoint would need to be in the same AS as the router receiving (vs originating) the route. Other documents may define other ways to signal tunnel information in BGP. For example, [RFC6514] defines the "P-Multicast Service Interface Tunnel" (PMSI Tunnel) attribute. In this specification, we do not consider the effects of advertising the Tunnel Encapsulation Attribute in conjunction with other forms of signaling tunnels. Any document specifying such joint use should provide details as to how interactions should be handled. It seems like we should perhaps go a step further and explicitly recommend not advertising such combinations in the absence of a specification for their combined use? Otherwise implementations will have to come up with their own interpretations, which could easily be uninteroperable. Section 13 The following sub-TLVs defined in this document MUST NOT occur more than once in a given Tunnel TLV: Tunnel Egress Endpoint (discussed below), Encapsulation, DS, UDP Destination Port, Embedded Label Handling, MPLS Label Stack, Prefix-SID. [...] Ah, thanks for listing these out. I had been wondering about this situation earlier in the doc, and it would have helped if the "not more than once" limitation was mentioned at each sub-TLV's definition (even if the actual error handling stays here). Section 14.3 Why do we only move "BGP Tunnel Encapsulation Attribute Sub-TLVs" (but not "BGP Tunnel Encapsulation Attribute Tunnel Types") to the new "BGP Tunnel Encapsulation Parameters" grouping? Section 14.9 It might be useful to indicate in the registry metadata how many flags are available (and, I suppose, in which order the bits are numbered). Section 15 We briefly discuss in the main body text the possibility that a tunnel will direct encapsulated traffic with (e.g) MPLS labels to a node that will misinterpret those labels; it might be worth reiterating the risk of such misinterpretation in the security considerations as well (or just referencing the previous discussion as security-relevant). I guess there's also a theoretical possibility that the flexibility in tunnel specification (including the type of expected content) could facilitate cross-protocol attacks, where the attacker causes the sender and recipient of encapsulated traffic to think that they should interpret the records in question as different protocols. But this seems so remote, and unlikely to succeed given different protocols' message structure, that it is probably not worth mentioning. Section 18.2 If we're using RFC 5462 as a reference for a field in an MPLS label, that seems to make it normative. We seem to depend on procedures from RFC 6811 in a few places, which also seems to make it normative.
- [Idr] Benjamin Kaduk's Discuss on draft-ietf-idr-… Benjamin Kaduk via Datatracker
- Re: [Idr] Benjamin Kaduk's Discuss on draft-ietf-… Susan Hares
- Re: [Idr] Benjamin Kaduk's Discuss on draft-ietf-… Benjamin Kaduk
- Re: [Idr] Benjamin Kaduk's Discuss on draft-ietf-… Susan Hares
- Re: [Idr] Benjamin Kaduk's Discuss on draft-ietf-… Alvaro Retana
- Re: [Idr] Benjamin Kaduk's Discuss on draft-ietf-… Benjamin Kaduk
- Re: [Idr] Benjamin Kaduk's Discuss on draft-ietf-… Alvaro Retana
- Re: [Idr] Benjamin Kaduk's Discuss on draft-ietf-… Susan Hares
- Re: [Idr] Benjamin Kaduk's Discuss on draft-ietf-… John Scudder
- Re: [Idr] Benjamin Kaduk's Discuss on draft-ietf-… Benjamin Kaduk
- Re: [Idr] Benjamin Kaduk's Discuss on draft-ietf-… John Scudder
- Re: [Idr] Benjamin Kaduk's Discuss on draft-ietf-… Alvaro Retana
- Re: [Idr] Benjamin Kaduk's Discuss on draft-ietf-… Alvaro Retana
- Re: [Idr] Benjamin Kaduk's Discuss on draft-ietf-… Benjamin Kaduk