Re: [bess] Benjamin Kaduk's Discuss on draft-ietf-bess-evpn-bum-procedure-updates-11: (with DISCUSS and COMMENT)

Benjamin Kaduk <kaduk@mit.edu> Fri, 22 October 2021 03:07 UTC

Date: Thu, 21 Oct 2021 20:07:38 -0700
From: Benjamin Kaduk <kaduk@mit.edu>
To: "Jeffrey (Zhaohui) Zhang" <zzhang@juniper.net>
Cc: The IESG <iesg@ietf.org>, "draft-ietf-bess-evpn-bum-procedure-updates@ietf.org" <draft-ietf-bess-evpn-bum-procedure-updates@ietf.org>, "bess-chairs@ietf.org" <bess-chairs@ietf.org>, "bess@ietf.org" <bess@ietf.org>, "EXT-zzhang_ietf@hotmail.com" <zzhang_ietf@hotmail.com>
Message-ID: <20211022030738.GU88762@kduck.mit.edu>
References: <163479537252.27220.2471413856188998579@ietfa.amsl.com> <BL0PR05MB565267D77F58E52C11A662EFD4BF9@BL0PR05MB5652.namprd05.prod.outlook.com>
MIME-Version: 1.0
Content-Type: text/plain; charset="iso-8859-1"
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
In-Reply-To: <BL0PR05MB565267D77F58E52C11A662EFD4BF9@BL0PR05MB5652.namprd05.prod.outlook.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/bess/GN2WLeGeJBO-StuWTAqPA20ezdU>
Subject: Re: [bess] Benjamin Kaduk's Discuss on draft-ietf-bess-evpn-bum-procedure-updates-11: (with DISCUSS and COMMENT)

Hi Jeffrey,

On Thu, Oct 21, 2021 at 02:11:01PM +0000, Jeffrey (Zhaohui) Zhang wrote:
> Hi Ben,
> 
> Thanks for your review and comments. Let me first address the DISCUSS point below and then follow up with another email on other points.

Sure, that sounds good.

> -----Original Message-----
> From: Benjamin Kaduk via Datatracker <noreply@ietf.org>
> Sent: Thursday, October 21, 2021 1:50 AM
> To: The IESG <iesg@ietf.org>
> Cc: draft-ietf-bess-evpn-bum-procedure-updates@ietf.org; bess-chairs@ietf.org; bess@ietf.org; EXT-zzhang_ietf@hotmail.com <zzhang_ietf@hotmail.com>
> Subject: Benjamin Kaduk's Discuss on draft-ietf-bess-evpn-bum-procedure-updates-11: (with DISCUSS and COMMENT)
> 
> 
> Benjamin Kaduk has entered the following ballot position for
> draft-ietf-bess-evpn-bum-procedure-updates-11: Discuss
> 
> When responding, please keep the subject line intact and reply to all
> email addresses included in the To and CC lines. (Feel free to cut this
> introductory paragraph, however.)
> 
> 
> Please refer to https://www.ietf.org/blog/handling-iesg-ballot-positions/
> for more information about how to handle DISCUSS and COMMENT positions.
> 
> 
> The document, along with other ballot positions, can be found here:
> https://datatracker.ietf.org/doc/draft-ietf-bess-evpn-bum-procedure-updates/
> 
> 
> 
> ----------------------------------------------------------------------
> DISCUSS:
> ----------------------------------------------------------------------
> 
> I'm not sure whether the Leaf A-D route (route type specific field) as
> specified by this document is guaranteed to have a unique
> interpretation (§3.3).  It's supposed to start with the "route key",
> which is just the route-type-specific field of the PMSI route that
> triggered the Leaf A-D route.  That makes the route key variable length
> (as stated), and this variation is clearly achieved given that even the
> Per-Region I-PMSI A-D route and S-PMSI A-D route defined in this
> document have different length and layout.  I'm not sure what
> information is expected to be available to bound and determine the
> length of this "route key" field, other than "not the rest of the
> stuff".  "The rest of the stuff" is the originator's address and length
> thereof, but the length is in the middle of the structure, so even if we
> start parsing from the back we still can't distinguish reliably between
> a 4-byte IPv4 address and a 16-byte IPv6 address.  It seems that the
> Originator's Addr Length would need to be the last field in order to
> provide a unique interpretation, with "parse backwards" used to extract
> the Originator's Address, and "the rest of the stuff" being the route
> key that can be matched to the triggering PMSI route.  Is there some
> other procedure or contextual information available that ensures a unique
> interpretation of this data?  Looking at RFCs 6514 and 7117, it does not
> seem like this document has some key change that renders it
> fundamentally different in this regard, so I mostly assume that the
> received route can be disambiguated somehow; I just don't know what that
> way is.
> 
> Zzh> All routes have the following format:
> 
>                     +-----------------------------------+
>                     |    Route Type (1 octet)           |
>                     +-----------------------------------+
>                     |     Length (1 octet)              |
>                     +-----------------------------------+
>                     | Route Type specific (variable)    |
>                     +-----------------------------------+
> Zzh> For a Leaf A-D route, its "Route Type specific" part is the triggering PMSI route's NLRI, so it explodes into the following (the three lines marked with '*' are the triggering PMSI).
> 
>                    Route Type 11 (Leaf A-D)
>                    Length (of the entire Leaf A-D route NLRI)
>                    * Route Type x (for the triggering PMSI route)
>                    * Length (of the PMSI route NLRI)
>                    * Route Type specific (for the PMSI route)
>                    Originator's Addr Length (for the Leaf A-D route)
>                    Originator's Addr (for the Leaf A-D route)
> 
> Zzh> With the above, there should be no problem parsing the NLRI?

Thanks for writing it out like that.

In a sense the triggering NLRI is "self-describing" because it has its own
type and length information; that internal structure does allow for the
location of the Originator's Addr Length field to be determined and the
rest of the Leaf A-D route to be parsed.  (And any given route type is
presumably going to have a fixed internal structure, though not necessarily
a fixed-length internal structure, which also helps ensure a unique
interpretation.  The triggering route has to be parsed itself, after all!)
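
For illustration, here is a minimal Python sketch of that parsing order. It
follows only the layout quoted above; the names parse_leaf_ad and LeafAD are
hypothetical, the triggering route's type-specific field is left as opaque
bytes, and (as an assumption, since the unit is not stated in this thread)
the Originator's Addr Length octet is returned as received while the address
size is taken from the octets that remain.

```python
import ipaddress
from dataclasses import dataclass
from typing import Union

LEAF_AD_ROUTE_TYPE = 11  # "Route Type 11 (Leaf A-D)" in the layout quoted above


@dataclass
class LeafAD:
    trigger_type: int      # route type of the triggering PMSI route
    trigger_body: bytes    # its route-type-specific field, left unparsed here
    addr_len_field: int    # Originator's Addr Length octet, not interpreted
    originator: Union[ipaddress.IPv4Address, ipaddress.IPv6Address]


def parse_leaf_ad(nlri: bytes) -> LeafAD:
    """Parse one Leaf A-D NLRI: type, length, embedded PMSI NLRI, originator."""
    if len(nlri) < 2:
        raise ValueError("truncated NLRI header")
    route_type, length = nlri[0], nlri[1]
    if route_type != LEAF_AD_ROUTE_TYPE:
        raise ValueError("not a Leaf A-D route")
    body = nlri[2:2 + length]
    if len(body) != length:
        raise ValueError("truncated Leaf A-D route")

    # The route key is the triggering PMSI route's whole NLRI, so it carries
    # its own type and length octets. That inner length is what locates the
    # Originator's Addr Length field.
    if len(body) < 2:
        raise ValueError("missing route key header")
    trigger_type, trigger_len = body[0], body[1]
    key_end = 2 + trigger_len
    if len(body) < key_end + 1:
        raise ValueError("route key overruns the Leaf A-D route")
    trigger_body = body[2:key_end]

    addr_len_field = body[key_end]
    addr = body[key_end + 1:]
    if len(addr) == 4:
        originator = ipaddress.IPv4Address(addr)
    elif len(addr) == 16:
        originator = ipaddress.IPv6Address(addr)
    else:
        raise ValueError("originator address is neither 4 nor 16 octets")
    return LeafAD(trigger_type, trigger_body, addr_len_field, originator)
```

The parser never has to guess where the route key ends: it reads forward,
and the inner length octet bounds the key before the originator fields are
touched.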

So in short, there is no protocol issue here, though we might want to think
about whether to include the "expanded version" of the diagram you show
above in the document itself, to avoid any other readers getting confused
in the same way that I did.

I will update my ballot position in the datatracker now, and wait for any
additional reply to the comment portion of the ballot.

Thanks again,

Ben

> Zzh> Thanks.
> Zzh> Jeffrey
> 
> ----------------------------------------------------------------------
> COMMENT:
> ----------------------------------------------------------------------
> 
> Section 1
> 
>    It is expected that audience is familiar with EVPN and MVPN concepts
>    and terminologies.  For convenience, the following terms are briefly
> 
> Please provide references for EVPN and MVPN that would serve as entry
> points for gaining such familiarity.  E.g., RFCs 7432 and 6513/6514.
> 
>    explained.
> 
> I suggest including PMSI Tunnel Attribute in this list, especially since
> RFC 6514 does not actually use the PTA acronym.
> 
> Section 2.1
> 
>    There is a difference between MVPN and VPLS multicast inter-as
>    segmentation.  For simplicity, EVPN will use the same procedures as
>    in MVPN.  All ASBRs can re-advertise their choice of the best route.
> 
> While it is defensible to rely on the stated expectation that the reader
> is familiar with EVPN and MVPN concepts (hmm, which does not actually
> include VPLS concepts?), it would be appreciated to include some
> indication of the nature of the difference, before stating which variant
> EVPN will use.
> 
>    For inter-area segmentation, [RFC7524] requires the use of Inter-area
>    P2MP Segmented Next-Hop Extended Community (S-NH-EC), and the setting
>    of "Leaf Information Required" L flag in PTA in certain situations.
>    Either of these could be optional in case of EVPN.  Removing these
> 
> "Could be"?  It sometimes is and sometimes isn't?  When is it still
> mandatory?
> 
> Section 2.1.1
> 
>    For example, an MVPN/VPLS/EVPN network may span multiple providers
>    and Inter-AS Option-B has to be used, in which the end-to-end
> 
> Is this "option (b)" of §7.2 of RFC 7117?  Regardless, a specific
> reference seems in order.
> 
>    Another advantage of the smaller region is smaller BIER sub-domains.
>    In this new multicast architecture BIER [RFC8279], packets carry a
> 
> RFC 8279 was published just about four years ago.  Does that still
> qualify as "new"?  (I honestly am not sure, given the distribution of
> time from -00 to RFC.)
> 
> Section 3
> 
>    The "Route Type specific" field of the type 9 and type 10 EVPN NLRIs
>    starts with a type 1 RD, whose Administrator sub-field MUST match
>    that of the RD in all non-Leaf A-D (Section 3.3) EVPN routes from the
>    same advertising router for a given EVI.
> 
> Is the requirement really so specific as "everything except Leaf A-D"?
> What if some new type is allocated that also doesn't start with an RD?
> Is it safe to say that any type that does have an RD must meet this
> criterion?
> 
> Section 4
> 
>    The optional optimizations specified for MVPN in [RFC8534] are also
>    applicable to EVPN when the S-PMSI/Leaf A-D routes procedures are
>    used for EVPN selective multicast forwarding.
> 
> Are we going to need a
> draft-ietf-something-bum-procedure-further-updates in another five years
> to perform the same type of "gap filling" that this document is doing
> for how RFC 7432 referred to the RFC 7117 procedures?
> 
> Section 5.1
> 
>    The above VPLS behavior requires complicated VPLS specific procedures
>    for the ASBRs to reach agreement.  For EVPN, a different approach is
>    used and the above quoted text is not applicable to EVPN.
> 
> What about the text we didn't quote, that places MUST-level requirement
> on the "best route selection procedures" that determine whether a given
> ASBR re-advertises the route within its own AS?
> 
>       "The PMSI Tunnel attribute MUST specify the tunnel for the segment.
>       If and only if, in order to establish the tunnel, the ASBR needs to
>       know the leaves of the tree, then the ASBR MUST set the L flag to
>       1 in the PTA to trigger Leaf A-D routes from egress PEs and
>       downstream ASBRs. It MUST be (auto-)configured with an import RT,
>       which controls acceptance of leaf A-D routes by the ASBR."
> 
> It seems like we might want to make some further statement about what
> scope that import RT is expected to limit acceptance of the route to.
> 
> Section 5.2
> 
>    considered as leaves (as proxies for those PEs in other ASes).  Note
>    that in case of Ingress Replication, when an ASBR re-advertises IMET
>    A-D routes to IBGP peers, it MUST advertise the same label for all
>    those for the same Ethernet Tag ID and the same EVI.  When an ingress
> 
> This seems like an eminently reasonable thing to require.  I wonder if
> it's worth saying a little more about why it is required, though -- what
> breaks if you don't do this?
> 
>    PE builds its flooding list, multiple routes might have the same
>    (nexthop, label) tuple and they MUST only be added as a single branch
>    in the flooding list.
> 
> I'm not entirely confident that I could implement this behavior for the
> flooding list right now.  On the other hand, I also haven't written any
> BGP code, so maybe it's expected that I couldn't implement it, but it
> still seems like this might be glossing over some details.
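
A minimal Python sketch of the de-duplication the quoted text asks for,
under stated assumptions: ImetRoute and build_flooding_list are hypothetical
names, and a route is reduced to just the fields needed here. It only shows
that routes sharing a (nexthop, label) tuple collapse into one branch; it is
not taken from the draft or from any BGP implementation.

```python
from typing import Iterable, List, NamedTuple, Tuple


class ImetRoute(NamedTuple):
    originator: str  # hypothetical identifier for the originating PE
    next_hop: str    # BGP next hop of the received route
    label: int       # label carried in the route's PTA


def build_flooding_list(routes: Iterable[ImetRoute]) -> List[Tuple[str, int]]:
    """Return one branch per distinct (next_hop, label), in first-seen order."""
    seen = set()
    branches = []
    for route in routes:
        branch = (route.next_hop, route.label)
        if branch not in seen:
            seen.add(branch)
            branches.append(branch)
    return branches
```

With two routes re-advertised by the same ASBR under one label plus one
local route, the resulting list has two branches rather than three.
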
> 
> Section 5.3
> 
>    o  An egress PE sends Leaf A-D routes in response to I-PMSI routes,
>       if the PTA has the L flag set (by the re-advertising ASBRs).
> 
> I don't think I understand the parenthetical.  Which previous text is it
> intending to refer to?
> 
> Additionally, while we mention in the first paragraph of §5.2 the RFC
> 7432 requirement to not set the L flag in IMET A-D routes, I don't see
> where we lift that requirement for the segmented procedures.  The change
> in §5.1 to let the ASBR set the L flag does not seem constructed in a
> way that lifts the requirement from §11.2 of RFC7432.
> 
>    To address this backward compatibility problem, the following
>    procedure can be used (see Section 6.2 for per-PE/AS/region I-PMSI
>    A-D routes):
> 
> Can be used, but is in no way mandatory, not even to implement?  That's
> rather surprising.
> 
>    o  The ASBRs in an AS originate per-region I-PMSI A-D routes and
>       advertise to their external peers to advertise tunnels used to
>       carry traffic from the local AS to other ASes.  Depending on the
> 
> This may or may not just be a grammar nit: *what* do the ASBRs advertise
> to their external peers to advertise tunnels?  (Are we just missing
> "them" for "advertise them"?)
> 
> Section 6.3
> 
>    [RFC7524] specifies the use of S-NH-EC because it does not allow ABRs
>    to change the BGP next hop when they re-advertise I/S-PMSI A-D routes
> 
> I failed to find any place where RFC 7524 used the string "S-NH-EC", and
> suggest writing out "Segmented Next-Hop Extended Community" somewhere.
> 
> Section 9
> 
> I would posit that at least some of the security considerations from RFC
> 6513 are relevant here, in addition to the (already mentioned) 6514
> considerations.
> 
> Section 12.1
> 
> I do not think that RFC 7988 needs to be classified as normative; we
> reference it only once, in an example; RFCs 4875 and 6388 are used in
> the same way for the same example, but are classified as informative.
> 
> Section 12.2
> 
> I don't see what's different between RFCs 6513 and 6514 to make the
> latter normative while the former is informative -- they are referenced
> in largely the same places, and often.
> 
> NITS
> 
> Abstract
> 
>    This document specifies procedure updates for broadcast, unknown
>    unicast, and multicast (BUM) traffic in Ethernet VPNs (EVPN),
> 
> I'd suggest NEW:
> 
>  This document specifies updated procedures for handling broadcast,
>  unknown unicast, and multicast (BUM) traffic in Ethernet VPNs (EVPN),
> 
> Section 1
> 
>    o  IMET A-D route [RFC7432]: Inclusive Multicast Ethernet Tag A-D
>       route.  The EVPN equivalent of MVPN Intra-AS I-PMSI A-D route.
> 
> I would say that a de novo explanation is likely to be of more general
> applicability than a dense MVPN reference.  Perhaps "Advertised by PEs
> to enable reception of BUM traffic for a given VLAN" or similar?
> 
>    o  SMET A-D route [I-D.ietf-bess-evpn-igmp-mld-proxy]: Selective
>       Multicast Ethernet Tag A-D route.  The EVPN equivalent of MVPN
>       Leaf A-D route but unsolicited and untargeted.
> 
> Likewise, this might be something like "Advertised by PEs to indicate
> that the indicated BUM traffic should be sent to the advertising PE."
> 
> Section 2
> 
>    [RFC7117] specifies procedures for Multicast in Virtual Private LAN
>    Service (VPLS Multicast) using both inclusive tunnels and selective
>    tunnels with or without inter-as segmentation, similar to Multicast
>    VPN (MVPN) procedures specified in [RFC6513] and [RFC6514].
> 
> s/to Multicast/to the Multicast/
> 
> Section 2.1.1
> 
>    Segmentation can also be used to divide an AS/area to smaller
>    regions, so that control plane state and/or forwarding plane state/
> 
> s/to smaller/into smaller/
> 
> Section 4
> 
>    of egress PEs for selective forwarding with BIER).  An NVE proxies
>    the IGMP/MLD state that it learns on its ACs to (C-S,C-G) or
>    (C-*,C-G) SMET routes and advertises to other NVEs, and a receiving
> 
> I think s/and/that it/
> 
>    NVE converts the SMET routes back to IGMP/MLD messages and send them
> 
> s/send/sends/
> 
> Section 5.3
> 
>    o  An ingress PE uses the Next Hop instead of Originating Router's IP
>       Address to determine leaves for the I-PMSI tunnel.
> 
> It was not previously required to use Originating Router's IP Address,
> so maybe s/instead of/and not use/.
> 
> Section 6.1
> 
>    changes the next hop to its own address and changes PTA to specify
>    the tunnel type/identification in the neighboring region 3.  Now the
> 
> I'm not sure that we ever explicitly named the region.  We implicitly do
> in the following figure that says "segment 3", but the number and string
> "region" don't seem paired anywhere directly.
> 
> Section 6.2
> 
>    propagated to other regions.  If multiple RBRs are connected to a
>    region, then each will advertise such a route, with the same route
>    key (Section 3.1).  Similar to the per-PE I-PMSI A-D routes, RBRs/PEs
> 
> Mention of route key seems to be in §3.3, not 3.1.
> 
> Section 6.3
> 
>    NH-EC.  The advantage of this is that neither ingress nor egress PEs
>    need to understand/use S-NH-EC, and consistent procedure (based on
>    BGP next hop) is used for both inter-as and inter-region
>    segmentation.
> 
> s/consistent/a consistent/
> 
> Section 7
> 
>    including Ingress Replication.  Via means outside the scope of this
>    document, PEs know that ESI labels are from DCB and existing multi-
>    homing procedures work as is, whether a multi-homed Ethernet Segment
>    spans across segmentation regions or not.
> 
> I'm not sure this is a well-formed sentence.  Is it supposed to be a
> list?  Or just two things that PEs know OOB: (ESI labels are from DCB and
> existing multi-homing procedures work as is) and (whether or not a
> multi-homed Ethernet Segment spans across segmentation regions)?
> 