Re: [Idr] Benjamin Kaduk's Discuss on draft-ietf-idr-tunnel-encaps-20: (with DISCUSS and COMMENT)

Ben: 

Thank you for the quick response. I look forward to getting suggestions from
the IESG.  

The authors have taken care to try to carefully insert the new technology
into the deployed web of BGP technology.  I agree with the IESG stance that
it is critical to make sure we clearly state what other standards need to be
fixed when deprecate non-working technology and insert new technology. 

John, I and Alvaro discussed a number of options as he crafted the appendix
on RFC 8365.  It would be good to hear from the collective wisdom of the
Routing ADs and the IESG on this subject. 

Sue 

-----Original Message-----
From: Benjamin Kaduk [mailto:kaduk@mit.edu] 
Sent: Tuesday, December 1, 2020 2:51 PM
To: Susan Hares
Cc: 'The IESG'; idr@ietf.org; idr-chairs@ietf.org;
draft-ietf-idr-tunnel-encaps@ietf.org
Subject: Re: [Idr] Benjamin Kaduk's Discuss on
draft-ietf-idr-tunnel-encaps-20: (with DISCUSS and COMMENT)

Hi Sue,

Thanks for asking.

I don't really have a concrete definition for "adequate", and am hoping that
the rest of the IESG can help determine that.
My primary goal here is to call the IESG's attention to the earlier comments
we had received on the other document and the question of whether those
comments apply to the matter at hand.  I expect that we will talk about it
on the telechat on thursday and that that will result in either me clearing
this discuss point or a set of concrete actionable criteria that need to be
met.

Does that help?

Thanks,

Ben

On Tue, Dec 01, 2020 at 02:44:26PM -0500, Susan Hares wrote:
> Benjamin: 
> 
> Since the IESG has engaged in a previous discussion regarding two
documents, would let me know what you think is an "adequate" discussion.
Are you looking for a joint discussion with the two working groups (IDR and
BESS) on the mail list plus a written report?  Or something else? 
> 
> Thank you, Sue
> 
> -----Original Message-----
> From: Idr [mailto:idr-bounces@ietf.org] On Behalf Of Benjamin Kaduk 
> via Datatracker
> Sent: Tuesday, December 1, 2020 1:18 PM
> To: The IESG
> Cc: idr@ietf.org; idr-chairs@ietf.org; 
> draft-ietf-idr-tunnel-encaps@ietf.org; shares@ndzh.com
> Subject: [Idr] Benjamin Kaduk's Discuss on 
> draft-ietf-idr-tunnel-encaps-20: (with DISCUSS and COMMENT)
> 
> Benjamin Kaduk has entered the following ballot position for
> draft-ietf-idr-tunnel-encaps-20: Discuss
> 
> When responding, please keep the subject line intact and reply to all 
> email addresses included in the To and CC lines. (Feel free to cut 
> this introductory paragraph, however.)
> 
> 
> Please refer to 
> https://www.ietf.org/iesg/statement/discuss-criteria.html
> for more information about IESG DISCUSS and COMMENT positions.
> 
> 
> The document, along with other ballot positions, can be found here:
> https://datatracker.ietf.org/doc/draft-ietf-idr-tunnel-encaps/
> 
> 
> 
> ----------------------------------------------------------------------
> DISCUSS:
> ----------------------------------------------------------------------
> 
> I support Erik's discuss.
> 
> I see that Roman has already suggested adding normative language regarding
the limitation to a single administrative domain (in addition to the "MUST
filter by default for EBGP sessions"), which I agree with.
> However, I think there is an additional consideration regarding the
limitation of use to a single administrative domain, wherein the domain of
use for the tunnel encapsulation attribute may diverge from the domain of
use of segment routing, that seems to place this document in conflict with
the requirements of RFC 8402.  In particular, RFC 8402 says, for SR-MPLS and
SRv6, that boundary routers "MUST filter any external traffic", and
additionally for SRv6 that "explicit routing information MUST NOT be leaked
through the boundaries of the administrered domain".  In §3.7 of this
document, though, we find that for the Prefix-SID sub-TLV, "the receiving
BGP speaker need not even be in the same Segment Routing Domain as the
tunnel's egress endpoint, and there is no implication that the prefix-SID
for the advertised prefix is the same in the Segment Routing domains of the
BGP speaker that originated the sub-TLV and the BGP speaker that received
it", which seems to suggest violation of the RFC 8402 requirement.  I think
we need to have greater clarity on what relationship is actually intended
between the SR domain and the tunnel encapsulation usage domain, and if they
are to diverge, we need to both somehow rectify this behavior with RFC 8402
and to very clearly document how the 8402-mandated filtering at the SR
domain boundary is supposed to happen when the boundary includes tunneled
traffic.
> 
> I also would like to ensure that we have had adequate discussion of the
relationship between this document and RFC 8365.  The IESG has received
comments recently (in the context of a different document) that it is
irresponsible to effectively obsolete or deprecate existing work without
identifying or explicitly updating such work, and without indicating whose
responsibility it is to find discrepancies.  That seems like it might apply
to what's currently in Appendix A, which on first reading suggests "there
might be a problem here, but we aren't saying exactly what or how to fix it,
or even who is going to do that work".
> 
> 
> ----------------------------------------------------------------------
> COMMENT:
> ----------------------------------------------------------------------
> 
> It's good to see that the shepherd writeup got updated as things changed;
thank you for keeping it up to date!
> 
> [I initially wrote some inline comments about handling internal 
> inconsistencies within a given tunnel specification as malformed and 
> ignoring the tunnel entirely vs specifying a precedence order (the 
> latter being what this document does).  I removed them, because this 
> seems to be a generic topic where security types tend to fail-closed 
> and routing types tend to aim to provide some kind of service when 
> possible, and I don't have anything new to add to the discussion for 
> these particular cases.]
> 
> I didn't see a response to the secdir review; it would be good to get a
response to that, in particular to hear what amount of consideration has
been given to what new ways this provides to attack BGP.
> 
> Abstract
> 
>    of certain other SAFIs.  This document adds support for additional
>    Tunnel Types, and allows a remote tunnel endpoint address to be
>    specified for each tunnel.  This document also provides support for
> 
> The shepherd writeup suggests that the "remote tunnel endpoint"
> terminology was switched to be "tunnel egress endpoint"; was this spot
missed?
> 
> Section 1.4
> 
>    o  Defining a new "Tunnel Egress Endpoint sub-TLV" (Section 3.1) that
>       can be included in any of the TLVs contained in the Tunnel
>       Encapsulation attribute.  This sub-TLV can be used to specify the
>       remote endpoint address of a particular tunnel.
> 
> ["remote endpoint" again]
> 
> Section 3.1
> 
> I agree with Martin V that there must be a story about this Reserved field
and why it's only SHOULD send as zero.  I don't know whether this
information needs to end up in the RFC but I think we should talk about why
it is this way.  In particular, the current requirements suggest that it
could be (mis?)used as an additional data channel by collaborating
implementations (that ignore the "MUST be disregarded"), without actually
writing up a specification for those semantics.
> 
>    If the Address Family subfield has any value other than IPv4 or IPv6,
>    the Tunnel Egress Endpoint sub-TLV is considered "unrecognized" 
> (see
> 
> We probably need to repeat the carve-out for the value 0 here, as well.
> (I dithered about remarking about the earlier "assumes that the 
> Address Family is either IPv4 or IPv6" since the "one special case" is 
> a few paragraphs later.)
> 
>    o  It can be determined that the IP address in the sub-TLV's address
>       subfield does not belong to the Autonomous System (AS) that
>       originated the route that contains the attribute.  Section 3.1.1
>       describes an optional procedure to make this determination.
> 
> This check seems highly important for the security of the system and
should get called out in the security considerations.
> 
> Section 3.1.1
> 
>    sub-TLV is considered not to be valid.  In some cases a network
>    operator who controls a set of Autonomous Systems might wish to allow
>    a Tunnel Egress Endpoint to reside in an AS other than Route_AS;
>    configuration MAY allow for such a case, in which case the check
>    becomes, if Egress_AS is not within the configured set of permitted
>    AS numbers, then the Tunnel Egress Endpoint sub-TLV is considered to
>    be "malformed".
> 
> (nit?) maybe "the configured set of permitted AS numbers that contains
Route_AS"?  The current wording implies that there can only be one such
configured set and that it is used regardless of Route_AS, which does not
seem right...
> 
> Section 3.2
> 
>    This section defines Encapsulation sub-TLVs for the following tunnel
>    types: VXLAN ([RFC7348]), NVGRE ([RFC7637]), MPLS-in-GRE ([RFC4023]),
>    L2TPv3 ([RFC3931]), and GRE ([RFC2784]).
> 
> Thanks for putting the links all in one place, here.  I, at least, would
have benefited from also putting the links/references in the corresponding
sections, but that is probably just a matter of style.
> 
> Section 3.2.1
> 
>       R: The remaining bits in the 8-bit flags field are reserved for
>       further use.  They MUST always be set to 0 by the originator of
>       the sub-TLV.  Intermediate routers MUST propagate them without
>       modification.  Any receiving routers MUST ignore these bits upon a
>       receipt of the sub-TLV.
> 
> nit: spurious "a" in "upon a receipt" (and diffing this section 
> against
> §3.2.2 it seems that maybe the "of the sub-TLV" is also superfluous?).
> 
>    o  If the V bit is not set, and the BGP UPDATE message has AFI/SAFI
>       other than Ethernet VPNs (EVPN) then the VXLAN tunnel cannot be
>       used.
> 
> If this is intended to refer to SAFI 70 (from RFC 7432), I note that the
IANA entry is named "BGP EVPNs".
> 
> [I also don't understand why it's okay for EVPN to not have a VN-ID 
> when using the VXLAN tunnel, but assume that's just my ignorance.]
> 
> Section 3.2.2
> 
>       Reserved (two fields): MUST be set to zero on transmission and
>       disregarded on receipt.
> 
> (nit) I only see one field marked "Reserved" (this format is the same
layout as for VXLAN).
> 
>    o  The values of the V, M, and R bits are NOT copied into the flags
>       field of the NVGRE header.  The flags field of the VXLAN header is
>       set as per [RFC7637].
> 
> (nit) stray "VXLAN"?
> 
> Section 3.2.5
> 
>    While it is not really necessary to have both the GRE and MPLS-in-GRE
>    tunnel types, both are included for reasons of backwards
>    compatibility.
> 
> It might be nice to have a few more words about which one is the "backward
compatible" option and what it's compatible with.
> 
> Section 3.3
> 
>    If an outer Encapsulation sub-TLV occurs in a TLV for a Tunnel Type
>    that does not use the corresponding outer encapsulation, the sub-TLV
>    MUST be treated as if it were an unknown type of sub-TLV.
> 
> nit: this is the only instance of "unknown" in the document; using
"unrecognized" seems to be the common case (and makes it easier to find
Section 13).
> 
> Section 3.3.1
> 
> I think we may need to discuss the semantics of the DS field here -- as I
understand it, the attribute advertised in this TLV is the DSCP value that
the BGP speaker would like to receive in traffic destined to the tunnel
egress endpoint (which may be a different node than the BGP speaker itself,
but is expected to be under the control of the same administrative entity).
Additionally, the interpretation of DSCP values is subject to local
interpretation on a given network.  Since the tunnel encapsulation attribute
is transitive, it will be propagated potentially across multiple BGP hops
and multiple ASes, so that the tunnel ingress endpoint is not necessarily
adjacent to the tunnel egress endpoint.
> Although we do say that we expect the tunnel encapsulation information to
only be propagated within an administrative boundary, there is no guarantee
that the administrative boundary in question uses a unified DSCP handling
procedure throughout it.  As such, it may be possible to end up in a regime
where the requested DSCP codepoint has a different, and potentially
hazardous, interpretation, at the ingress of the tunnel.
> So, it seems that we need to say something about either local policy for
DSCP value filtering, or only using this value when "directly" connected to
the egress AS, or similar; we do have something like this already for the TC
portion of the MPLS label stack entries.
> 
> Section 3.5
> 
>    labeled address family, then the sub-TLV MUST be disregarded.  If the
>    sub-TLV is contained in a TLV whose Tunnel Type does not have a
>    virtual network identifier in its encapsulation header, the sub-TLV
>    MUST be disregarded.  In those cases where the sub-TLV is ignored, it
>    SHOULD NOT be stripped from the TLV before the route is propagated.
> 
> Why only SHOULD NOT here?  I thought we hat MUST-level requirements to
preserve things unchaged in similar situations.
> 
> Section 4.1
> 
>    In the remainder of this specification, when a route is referred to
>    as containing a Tunnel Encapsulation attribute with a TLV identifying
>    a particular Tunnel Type, it implicitly includes the case where the
>    route contains a Tunnel Encapsulation Extended Community identifying
>    that Tunnel Type.
> 
> I searched the entire document for the string "identifying" and did not
find any instances where a route was referred to as containing a Tunenl
Encapsulation attribute with a TLV identifying a particular tunnel type.
> Perhaps I should be looking for the "attribute" keyword, but there are
over 200 instances of that string; could you confirm whether this implicit
inclusion is actually used anywhere (and if so, give an example of such
usage)?
> 
> Section 6
> 
>    [RFC5512] specifies the use of the Tunnel Encapsulation attribute in
>    BGP UPDATE messages of AFI/SAFI 1/7 and 2/7.  That document restricts
>    the use of this attribute to UPDATE messages of those SAFIs.  This
>    document removes that restriction.
> 
> I believe another reviewer commented on the ambiguity of "that", which I
first thought referred to "this" vs "that"; I now see that there is
additional ambituity as to whether it is the SAFI restriction or the UPDATE
message restriction that is lifted, and suggest clarification of that as
well.
> 
>    Once it is determined to send a packet through the tunnel specified
>    in a particular Tunnel TLV of a particular Tunnel Encapsulation
>    attribute, then the tunnel's egress endpoint address is the IP
>    address contained in the sub-TLV.  If the Tunnel TLV contains a
> 
> nit: I think we have to say "Tunnel Egress Endpoint sub-TLV"; the use of
the definite article is not justified by the preceding context.
> 
> Section 8
> 
>    However, suppose that one of the TLVs in U2's Tunnel Encapsulation
>    attribute contains the Color Sub-TLV.  In that case, packet P MUST
>    NOT be sent through the tunnel contained in that TLV, unless U1 is
>    carrying the Color Extended Community that is identified in U2's
>    Color Sub-TLV.
> 
> We should probably reword this in light of Section 13's discussion that a
given Tunnel TLV can have more than one Color sub-TLV present.
> 
> Section 9.2
> 
>    Three of the tunnel types that can be specified in a Tunnel
>    Encapsulation TLV have virtual network identifier fields in their
>    encapsulation headers.  In the VXLAN encapsulation, this field is
>    called the VNI (Virtual Network Identifier) field; in the NVGRE
>    encapsulation, this field is called the VSID (Virtual Subnet
>    Identifier) field.
> 
> We start off by saying "three" types but list only two.  What's the third
type?
> 
> Section 9.2.2.2
> 
>    If the TLV identifying the tunnel does not contain an Encapsulation
>    sub-TLV whose V bit is set, the virtual network identifier field of
>    the encapsulation header is set as follows:
> 
> This perhaps sets us up for a nasty surprise in light of the
error-handling behavior in §13, where we are supposed to disregard the
second and subsequent instances of the Encapsulation sub-TLV.  This language
is not particularly clear about whether it applies to only the first sub-TLV
or all instances.
> 
>    o  If the TLV does not contain an Embedded Label Handling sub-TLV, or
>       if it contains an Embedded Label Handling sub-TLV whose value is
>       2, the embedded label is copied into the lower 3 octets of the
>       virtual network identifier field of the encapsulation header.
> 
> nit: I think "lower" is unneeded, since all the VNI fields are exactly 3
octets now.
> 
> Section 10
> 
>    Note that if the Tunnel Encapsulation attribute is attached to a VPN-
>    IP route [RFC4364], and if Inter-AS "option b" (see section 10 of
>    [RFC4364]) is being used, and if the Tunnel Egress Endpoint sub-TLV
>    contains an IP address that is not in same AS as the router receiving
>    the route, it is very likely that the embedded label has been
>    changed.  [...]
> 
> I'm not sure that I'm understanding this scenario properly.  The label has
been "changed" with respect to what baseline?  I'm also not sure why the
tunnel egress endpoint would need to be in the same AS as the router
receiving (vs originating) the route.
> 
>    Other documents may define other ways to signal tunnel information in
>    BGP.  For example, [RFC6514] defines the "P-Multicast Service
>    Interface Tunnel" (PMSI Tunnel) attribute.  In this specification, we
>    do not consider the effects of advertising the Tunnel Encapsulation
>    Attribute in conjunction with other forms of signaling tunnels.  Any
>    document specifying such joint use should provide details as to how
>    interactions should be handled.
> 
> It seems like we should perhaps go a step further and explicitly recommend
not advertising such combinations in the absence of a specification for
their combined use?  Otherwise implementations will have to come up with
their own interpretations, which could easily be uninteroperable.
> 
> Section 13
> 
>    The following sub-TLVs defined in this document MUST NOT occur more
>    than once in a given Tunnel TLV: Tunnel Egress Endpoint (discussed
>    below), Encapsulation, DS, UDP Destination Port, Embedded Label
>    Handling, MPLS Label Stack, Prefix-SID.  [...]
> 
> Ah, thanks for listing these out.  I had been wondering about this
situation earlier in the doc, and it would have helped if the "not more than
once" limitation was mentioned at each sub-TLV's definition (even if the
actual error handling stays here).
> 
> Section 14.3
> 
> Why do we only move "BGP Tunnel Encapsulation Attribute Sub-TLVs" (but not
"BGP Tunnel Encapsulation Attribute Tunnel Types") to the new "BGP Tunnel
Encapsulation Parameters" grouping?
> 
> Section 14.9
> 
> It might be useful to indicate in the registry metadata how many flags are
available (and, I suppose, in which order the bits are numbered).
> 
> Section 15
> 
> We briefly discuss in the main body text the possibility that a tunnel
will direct encapsulated traffic with (e.g) MPLS labels to a node that will
misinterpret those labels; it might be worth reiterating the risk of such
misinterpretation in the security considerations as well (or just
referencing the previous discussion as security-relevant).
> 
> I guess there's also a theoretical possibility that the flexibility in
tunnel specification (including the type of expected content) could
facilitate cross-protocol attacks, where the attacker causes the sender and
recipient of encapsulated traffic to think that they should interpret the
records in question as different protocols.  But this seems so remote, and
unlikely to succeed given different protocols'
> message structure, that it is probably not worth mentioning.
> 
> Section 18.2
> 
> If we're using RFC 5462 as a reference for a field in an MPLS label, that
seems to make it normative.
> 
> We seem to depend on procedures from RFC 6811 in a few places, which also
seems to make it normative.
> 
> 
> 
> _______________________________________________
> Idr mailing list
> Idr@ietf.org
> https://www.ietf.org/mailman/listinfo/idr
>