Re: [bess] John Scudder's Discuss on draft-ietf-bess-evpn-optimized-ir-09: (with DISCUSS and COMMENT)

"Rabadan, Jorge (Nokia - US/Mountain View)" <jorge.rabadan@nokia.com> Mon, 08 November 2021 15:21 UTC

From: "Rabadan, Jorge (Nokia - US/Mountain View)" <jorge.rabadan@nokia.com>
To: John Scudder <jgs@juniper.net>, The IESG <iesg@ietf.org>, "draft-ietf-bess-evpn-optimized-ir@ietf.org" <draft-ietf-bess-evpn-optimized-ir@ietf.org>, "bess-chairs@ietf.org" <bess-chairs@ietf.org>, "bess@ietf.org" <bess@ietf.org>, "Bocci, Matthew (Nokia - GB)" <matthew.bocci@nokia.com>
Thread-Topic: John Scudder's Discuss on draft-ietf-bess-evpn-optimized-ir-09: (with DISCUSS and COMMENT)
Thread-Index: AQHXxhfMrlBD4AIth0id/PNXgGqUQ6vyb5sV
Date: Mon, 08 Nov 2021 15:20:31 +0000
Message-ID: <BY3PR08MB706048E96EC99525C286418CF78C9@BY3PR08MB7060.namprd08.prod.outlook.com>
References: <163477834717.27602.11452549676478352862@ietfa.amsl.com>
In-Reply-To: <163477834717.27602.11452549676478352862@ietfa.amsl.com>
Accept-Language: en-US
Content-Language: en-US
Archived-At: <https://mailarchive.ietf.org/arch/msg/bess/7gf-1ItQC37ciol_RMy5P4e63mw>
Subject: Re: [bess] John Scudder's Discuss on draft-ietf-bess-evpn-optimized-ir-09: (with DISCUSS and COMMENT)

Hi John,

First of all, thank you very much for your time and thorough review. You make great points, and the document is now in much better shape. We really appreciate it.
Please see my replies in-line below, marked with [jorge].

Thank you.
Jorge

From: John Scudder via Datatracker <noreply@ietf.org>
Date: Wednesday, October 20, 2021 at 6:05 PM
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-bess-evpn-optimized-ir@ietf.org <draft-ietf-bess-evpn-optimized-ir@ietf.org>, bess-chairs@ietf.org <bess-chairs@ietf.org>, bess@ietf.org <bess@ietf.org>, Bocci, Matthew (Nokia - GB) <matthew.bocci@nokia.com>, Bocci, Matthew (Nokia - GB) <matthew.bocci@nokia.com>
Subject: John Scudder's Discuss on draft-ietf-bess-evpn-optimized-ir-09: (with DISCUSS and COMMENT)
John Scudder has entered the following ballot position for
draft-ietf-bess-evpn-optimized-ir-09: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/blog/handling-iesg-ballot-positions/
for more information about how to handle DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-bess-evpn-optimized-ir/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

DISCUSS:

In my review there are a number of comments/questions that I would like
to be sure of having discussed. In particular, the questions about use
of SHOULD without associated discussion of exception cases.  I would
also like to make sure the question about whether non-BM receivers
are, or are not, excluded from flood lists (§6.1) has been addressed.

Of course I would also appreciate replies to my other comments! :-)
[jorge] sure. Please see below.



----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

Overall comments:

1. This document suffers from what I think is an overuse of
   abbreviations.  See
   https://www.psychologicalscience.org/observer/alienating-the-audience-how-abbreviations-hamper-scientific-communication
   for one perspective on why this is problematic.  Any individual one of these
   doesn't rise to the level of being objectionable, but in aggregate at some
   point it makes the document a lot less accessible to anyone who isn't part of
   the in-group who has memorized the abbreviations. 4r xm, <- ... is !@? to
   rd, 4r no gd rn, [see terminology section below] even though anyone who goes
   to the effort of looking up the terminology can decode it.  I would really
   prefer it if this were improved; I think it's not that much work for the
   authors and will make the resulting spec more usable.  I had intended to
   offer an example edit that expands many of the abbreviations, but have run
   out of time; I'd still be willing to do it later if requested, let me know.

   (Consider also the contrast with RFC 6514; for instance instead of
   referring to "the L-flag", when mentioning that flag it says "the Leaf
   Information Required flag".  Since we don't pay by the byte for publishing
   our documents, it seems to be worth spending a few more keystrokes to
   make it easier to read them.)
[jorge] that’s a fair comment and a fresh view. As authors, we are so familiar with the acronyms that we don’t notice the problem, but looking at the document again from a different perspective, you are 100% right. I definitely think the document is more readable now:

  *   we reduced the number of abbreviations
  *   we ordered the terminology in section 2


2. The document starts in the middle.  It jumps right from the requirements
   to the tunnel attribute diagram, with no overview or outline of the
   solution.  This is related to Pascal's review comment, mentioned by Éric
   Vyncke.
[jorge] the introduction has been extended/re-written including an outline of the solution. Thanks!


Terminology:

   4r: for
   xm: example
   <-: this
   ...: sentence
   !@?: difficult
   rd: read
   gd: good
   rn: reason
[jorge] got you 😊


Detailed review:

I’ve done my comments in the form of an edited copy of the draft.  I
don't think the datatracker tooling allows me to use attachments, so
I'll follow up to this with an email with attached edited copy, as well
as a PDF of the rfcdiff output for your convenience if you’d like to use
it. I’ve also pasted a traditional diff below to capture the comments
for the record and in case you want to use it for in-line reply. I’d
appreciate feedback regarding whether you found this a useful way to
receive my comments as compared to a more traditional numbered list of
comments with selective quotation from the draft.
[jorge] the PDF with the side-by-side diff and the copy below helped greatly. We really appreciate that you took the time to provide the diff and put your comments in the edited text, and that you thought about which format would help us most. Thank you very much. Please see my comments in-line below. Where I have no comment, it means we changed the text as you asked.


*** draft-ietf-bess-evpn-optimized-ir-09.txt    2021-10-20 13:48:15.000000000 -0400
--- draft-ietf-bess-evpn-optimized-ir-09-jgs-markup.txt 2021-10-20 20:39:39.000000000 -0400
***************
*** 19,25 ****

  Abstract

!    Network Virtualization Overlay (NVO) networks using EVPN as control
     plane may use Ingress Replication (IR) or PIM (Protocol Independent
     Multicast) based trees to convey the overlay Broadcast, Unknown
     unicast and Multicast (BUM) traffic.  PIM provides an efficient
--- 19,25 ----

  Abstract

!    Network Virtualization Overlay (NVO) networks using EVPN as their control
     plane may use Ingress Replication (IR) or PIM (Protocol Independent
     Multicast) based trees to convey the overlay Broadcast, Unknown
     unicast and Multicast (BUM) traffic.  PIM provides an efficient
***************
*** 105,111 ****

     Ethernet Virtual Private Networks (EVPN) may be used as the control
     plane for a Network Virtualization Overlay (NVO) network.  Network
!    Virtualization Edge (NVE) devices and Provider Edges (PEs) that are

--- 105,111 ----

     Ethernet Virtual Private Networks (EVPN) may be used as the control
     plane for a Network Virtualization Overlay (NVO) network.  Network
!    Virtualization Edge (NVE) and Provider Edge (PEs) devices that are

***************
*** 182,187 ****
--- 182,191 ----
     "OPTIONAL" in this document are to be interpreted as described in BCP
     14 [RFC2119] [RFC8174] when, and only when, they appear in all
     capitals, as shown here.
+
+ Is there any logic to the order in which the terms are presented? If so,
+ it escaped me. It would have been much better for my reading of the document,
+ if the terms had been given in alphabetical order, for obvious reasons.
[jorge] alphabetic order is now in place, thanks


     The following terminology is used throughout the document:

***************
*** 236,241 ****
--- 240,247 ----
        Replicator-AR route.  It is used to identify the ingress packets
        that must follow AR procedures ONLY in the Single-IP AR-REPLICATOR
        case.
+
+ A reference to section 8 would be helpful in the above.

     -  IR-VNI: VNI advertised along with the RT-3 for IR.

***************
*** 288,296 ****
--- 294,313 ----
     hereafter) meets the following requirements:

     a.  It provides an IR optimization for BM (Broadcast and Multicast)
+
+ Thank you for expanding "BM", but... you've already defined it in your
+ Terminology section, so maybe you don't need to define it again. (But see
+ also my general comment on the subject of abbreviations in general;
+ depending on how we resolve that this comment may be overtaken by events.)
+
         traffic without the need for PIM, while preserving the packet
         order for unicast applications, i.e., known and unknown unicast
         traffic should follow the same path.  This optimization is
+
+ ... the same path as what? If you mean unknown should follow the same path
+ as known, then use "... i.e., unknown unicast traffic should follow the same
+ path as known unicast traffic". If you mean something different, what is it?
+
         required in low-performance NVEs.

     b.  It reduces the flooded traffic in NVO networks where some NVEs do
***************
*** 361,369 ****
--- 378,403 ----

     The Flags field is 8 bits long.  This document defines the use of 4
     bits of this Flags field:
+
+ It would be quite helpful to include a diagram of the Flags field as in
+ RFC 6514 §5:
+
+    The Flags field has the following format:
+
+        0 1 2 3 4 5 6 7
+       +-+-+-+-+-+-+-+-+
+       |  reserved   |L|
+       +-+-+-+-+-+-+-+-+
+
+ except of course with all the new and previously-defined flags filled
+ in too.

     -  bits 3 and 4, forming together the Assisted-Replication Type (T)
        field
+
+ Up here you call it the Assisted-Replication Type field.  Just a few lines
+ later you call it the AR Type field.  Can you make up your mind and use
+ one or the other, please?

     -  bit 5, called the Broadcast and Multicast (BM) flag

***************
*** 406,411 ****
--- 440,448 ----
     -  Flag L is an existing flag defined in [RFC6514] (L=Leaf
        Information Required) and it will be used only in the Selective AR
        Solution.
+
+ I think it would be nice to provide the bit position for this flag, as in
+ "(L=Leaf Information Required, bit 7)"

     Please refer to Section 11 for the IANA considerations related to the
     PTA flags.
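
For readers following the bit positions discussed in this hunk, here is a minimal sketch of how the Flags octet could be decoded. It assumes the RFC 6514 diagram convention quoted above (bit 0 is the most significant bit, so the L flag in bit 7 is the least significant), and it uses only the positions named in the quoted text: bits 3 and 4 for the Assisted-Replication Type, bit 5 for the BM flag, and bit 7 for L. The names PtaFlags and decode_pta_flags are hypothetical and do not come from the draft; the type value is left as a raw integer because the mapping of values to roles is not part of the quoted text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PtaFlags:
    """Hypothetical container for the flags discussed in this hunk."""
    ar_type: int               # 2-bit Assisted-Replication Type (T), bits 3 and 4
    bm: bool                   # Broadcast and Multicast (BM) flag, bit 5
    leaf_info_required: bool   # L flag from RFC 6514, bit 7


def decode_pta_flags(flags: int) -> PtaFlags:
    """Decode one Flags octet, assuming bit 0 is the most significant bit."""
    if not 0 <= flags <= 0xFF:
        raise ValueError("the Flags field is a single octet")
    # With bit 0 as the MSB, bit n sits at shift (7 - n).
    ar_type = (flags >> 3) & 0b11      # bits 3 and 4 -> shifts 4 and 3
    bm = bool((flags >> 2) & 1)        # bit 5 -> shift 2
    leaf = bool(flags & 1)             # bit 7 -> shift 0
    return PtaFlags(ar_type, bm, leaf)
```

For instance, decode_pta_flags(0b00010101) yields ar_type 2 with both the BM and L flags set.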
***************
*** 420,436 ****
--- 457,497 ----
        address that we denominate IR-IP in this document.  When
        advertised by an AR-LEAF node, the Regular-IR route SHOULD be
        advertised with type T= AR-LEAF.
+
+ Your use of SHOULD implies there is at least one case where a reasonable
+ implementation could choose to advertise a Regular-IR route from an
+ AR-LEAF node with a different type.  I am left to guess what the case is,
+ and what value it should choose then.  Maybe it would use RNVE instead?
+ Please say something about this.  On the other hand if there isn't any
+ such case, this should be a MUST.
[jorge] the logic for using a SHOULD was that, even if the AR-LEAF does not set T=AR-LEAF, the procedures in the document would still work. However, I changed it to MUST. If the AR-LEAF is an RNVE, then it is not an AR-LEAF… so I think it is better to use a MUST.


     -  Replicator-AR route: this route is used by the AR-REPLICATOR to
        advertise its AR capabilities, with the fields set as follows:

        o  Originating Router's IP Address MUST be set to an IP address of
           the PE that should be common to all the EVIs on the PE (usually
+
+ What's "the PE" in this context?  I'm assuming it means "the advertising
+ router".  If that's right, please say that instead of "the PE".
+
           this is the PE's loopback address).  The Tunnel Identifier and
           Next-Hop SHOULD be set to the same IP address as the
           Originating Router's IP address when the NVE/PE originates the
           route.  The Next-Hop address is referred to as the AR-IP and
           SHOULD be different than the IR-IP for a given PE/NVE.
+
+ Similar question to my earlier one about the two SHOULDs above.  I guess
+ in the case of the second SHOULD, it MAY be the same in the case of a
+ router unable to support two different IP addresses for this purpose, in
+ which case the procedures of Section 8 MUST be applied?  If that's right,
+ please add language to that effect.
+
+ As for the first SHOULD, does this imply that the Tunnel Identifier and
+ Next-Hop MAY be set to the IP address of some other router?
+
+ Also, "when the NVE/PE originates the route" -- in this section aren't
+ we always talking about the NVE/PE originating the route?  This clause
+ makes me think there is another case, but I can't figure out what it is.
[jorge] all good points, thanks. The new text reads as follows:

   -  Replicator-AR route: this route is used by the AR-REPLICATOR to
      advertise its AR capabilities, with the fields set as follows:

      o  Originating Router's IP Address MUST be set to an IP address of
         the advertising router that is common to all the EVIs on the PE
         (usually this is a loopback address of the PE).

         +  The Tunnel Identifier and Next-Hop SHOULD be set to the same
            IP address as the Originating Router's IP address when the
            NVE/PE originates the route, that is, when the NVE/PE is not
            an ASBR as in section 10.2 of [RFC8365].  Irrespective of
            the values in the Tunnel Identifier and Originating Router's
            IP Address fields, the ingress NVE/PE will process the
            received Replicator-AR route and will use the IP Address in
            the Next-Hop field to create IP tunnels to the AR-
            REPLICATOR.

         +  The Next-Hop address is referred to as the AR-IP and MUST be
            different from the IR-IP for a given PE/NVE, unless the
            procedures in Section 8 are followed.
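
The MUST and SHOULD in the revised text above can be read as two separate checks on a Replicator-AR route. The following is a rough sketch under stated assumptions, not an implementation from the draft: the ReplicatorArRoute record, the check_replicator_ar_route function, and the section8_in_use switch (standing in for "the procedures in Section 8 are followed") are all hypothetical names, and the ASBR exception to the SHOULD is not modelled.

```python
from dataclasses import dataclass
from ipaddress import ip_address
from typing import List, Tuple


@dataclass(frozen=True)
class ReplicatorArRoute:
    """Hypothetical record holding the three address fields discussed above."""
    originating_router_ip: str
    tunnel_identifier: str
    next_hop: str  # the AR-IP


def check_replicator_ar_route(route: ReplicatorArRoute, ir_ip: str,
                              section8_in_use: bool = False) -> List[Tuple[str, str]]:
    """Return (level, message) findings for a locally originated route."""
    findings = []
    ar_ip = ip_address(route.next_hop)
    originator = ip_address(route.originating_router_ip)
    # MUST: the AR-IP differs from the IR-IP unless Section 8 applies.
    if ar_ip == ip_address(ir_ip) and not section8_in_use:
        findings.append(("MUST", "AR-IP equals IR-IP without Section 8 procedures"))
    # SHOULD: Tunnel Identifier and Next-Hop match the Originating Router's IP.
    if ip_address(route.tunnel_identifier) != originator or ar_ip != originator:
        findings.append(("SHOULD", "Tunnel Identifier/Next-Hop differ from Originating Router's IP"))
    return findings
```

Addresses are compared as parsed ip_address objects rather than strings so that equivalent textual forms of the same address compare equal.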




        o  Tunnel Type = Assisted-Replication Tunnel.  Section 11 provides
           the allocated type value.
***************
*** 440,446 ****
        o  L (Leaf Information Required) = 0 (for non-selective AR) or 1
           (for selective AR).

!    In addition, this document also uses the Leaf A-D route (RT-11)
     defined in [I-D.ietf-bess-evpn-bum-procedure-updates] in case the

--- 501,507 ----
        o  L (Leaf Information Required) = 0 (for non-selective AR) or 1
           (for selective AR).

!    In addition, this document also uses the Leaf Auto-Discovery (A-D) route (RT-11)
     defined in [I-D.ietf-bess-evpn-bum-procedure-updates] in case the

***************
*** 452,457 ****
--- 513,522 ----

     selective AR mode is used.  The Leaf A-D route MAY be used by the AR-
     LEAF in response to a Replicator-AR route (with the L flag set) to
+
+ The above is ambiguous.  Maybe "An AR-LEAF MAY send a Leaf A-D route in
+ response to reception of a Replicator-AR route whose L flag is set."?
+
     advertise its desire to receive the BM traffic from a specific AR-
     REPLICATOR.  It is only used for selective AR and its fields are set
     as follows:
***************
*** 459,466 ****
--- 524,538 ----

        o  Originating Router's IP Address is set to the advertising PE's
+
+ What's "the PE" in this context?  I'm assuming it means "the advertising
+ router".  If that's right, please say that instead of "the PE".
+
           IP address (same IP used by the AR-LEAF in regular-IR routes).
           The Next-Hop address is set to the IR-IP.
+
+ ... and the IR-IP is different from the "advertising PE's IP address" I
+ guess?

        o  Route Key is the "Route Type Specific" NLRI of the Replicator-
           AR route for which this Leaf A-D route is generated.
***************
*** 477,483 ****

        o  The Leaf A-D route MUST include the PMSI Tunnel attribute with
           the Tunnel Type set to AR, type set to AR-LEAF and the Tunnel
!          Identifier set to the IP of the advertising AR-LEAF.  The PMSI
           Tunnel attribute MUST carry a downstream-assigned MPLS label or
           VNI that is used by the AR-REPLICATOR to send traffic to the
           AR-LEAF.
--- 549,555 ----

        o  The Leaf A-D route MUST include the PMSI Tunnel attribute with
           the Tunnel Type set to AR, type set to AR-LEAF and the Tunnel
!          Identifier set to the IP address of the advertising AR-LEAF.  The PMSI
           Tunnel attribute MUST carry a downstream-assigned MPLS label or
           VNI that is used by the AR-REPLICATOR to send traffic to the
           AR-LEAF.
***************
*** 488,494 ****

     Each node attached to the BD may understand and process the BM/U
     flags.  Note that these BM/U flags may be used to optimize the
!    delivery of multi-destination traffic and its use SHOULD be an
     administrative choice, and independent of the AR role.

     Non-optimized-IR nodes will be unaware of the new PMSI attribute flag
--- 560,566 ----

     Each node attached to the BD may understand and process the BM/U
     flags.  Note that these BM/U flags may be used to optimize the
!    delivery of multi-destination traffic and their use SHOULD be an
     administrative choice, and independent of the AR role.

     Non-optimized-IR nodes will be unaware of the new PMSI attribute flag
***************
*** 512,518 ****
     AR function is enabled.  Three different roles are defined for a
     given BD: AR-REPLICATOR, AR-LEAF and RNVE (Regular NVE).  The
     solution is called "non-selective" because the chosen AR-REPLICATOR
!    for a given flow MUST replicate the BM traffic to 'all' the NVE/PEs
     in the BD except for the source NVE/PE.

                             (           )
--- 584,590 ----
     AR function is enabled.  Three different roles are defined for a
     given BD: AR-REPLICATOR, AR-LEAF and RNVE (Regular NVE).  The
     solution is called "non-selective" because the chosen AR-REPLICATOR
!    for a given flow MUST replicate the BM traffic to all the NVE/PEs
     in the BD except for the source NVE/PE.

                             (           )
***************
*** 567,572 ****
--- 639,658 ----
     An AR-REPLICATOR is defined as an NVE/PE capable of replicating
     ingress BM (Broadcast and Multicast) traffic received on an overlay
     tunnel to other overlay tunnels and local Attachment Circuits (ACs).
+
+ This is different from the definition you have in the terminology section,
+ which is:
+
+    -  AR-REPLICATOR: Assisted Replication - REPLICATOR, refers to an
+       NVE/PE that can replicate Broadcast or Multicast traffic received
+       on overlay tunnels to other overlay tunnels.
+
+ In the definition here, you mention local attachment circuits, in §2 you
+ don't.  Probably you should harmonize these definitions.  Having done so,
+ it's not clear to me that you need to repeat the definition here (though
+ if you think you need to remind the reader of what you already told them,
+ it's OK).
+
     The AR-REPLICATOR signals its role in the control plane and
     understands where the other roles (AR-LEAF nodes, RNVEs and other AR-
     REPLICATORs) are located.  A given AR-enabled BD service may have
***************
*** 584,608 ****
         generate a Regular-IR route if it does not have local attachment
         circuits (AC).  If the Regular-IR route is advertised, the AR
         Type field is set to zero.

     c.  The Replicator-AR and Regular-IR routes are generated according
         to section 3.  The AR-IP and IR-IP used by the AR-REPLICATOR are
         different routable IP addresses.

     d.  When a node defined as AR-REPLICATOR receives a BM packet on an
!        overlay tunnel, it will do a tunnel destination IP lookup and
         apply the following procedures:

!        o  If the destination IP is the AR-REPLICATOR IR-IP Address the
            node will process the packet normally as in [RFC7432].

!        o  If the destination IP is the AR-REPLICATOR AR-IP Address the
            node MUST replicate the packet to local ACs and overlay
            tunnels (excluding the overlay tunnel to the source of the
            packet).  When replicating to remote AR-REPLICATORs the tunnel
!           destination IP will be an IR-IP.  That will be an indication
            for the remote AR-REPLICATOR that it MUST NOT replicate to
!           overlay tunnels.  The tunnel source IP used by the AR-
            REPLICATOR MUST be its IR-IP when replicating to either AR-
            REPLICATOR or AR-LEAF nodes.

--- 670,705 ----
         generate a Regular-IR route if it does not have local attachment
         circuits (AC).  If the Regular-IR route is advertised, the AR
         Type field is set to zero.
+
+ Do you mean "... the AR Type field of the Replicator-AR route MUST be
+ set to zero"?  If so, please say that.
[jorge] actually we mean the Regular-IR route. Changed it to:

       If the Regular-IR route is advertised, the
       Assisted-Replication Type field of the Regular-IR route MUST be
       set to zero.



     c.  The Replicator-AR and Regular-IR routes are generated according
         to section 3.  The AR-IP and IR-IP used by the AR-REPLICATOR are
+
+ I think you mean Section 4?
+
         different routable IP addresses.
+
+ I think you'll find that "routable IP address" isn't a well-defined
+ term (for example I'm sure you're NOT talking specifically about non-
+ RFC 1918 addresses).  Can you choose different language here to say
+ what you mean?
[jorge] changed to:

   c.  The Replicator-AR and Regular-IR routes are generated according
       to Section 4.  The AR-IP and IR-IP are different IP addresses
       owned by the AR-REPLICATOR.



     d.  When a node defined as AR-REPLICATOR receives a BM packet on an
!        overlay tunnel, it will do a tunnel destination IP address lookup and
         apply the following procedures:

!        o  If the destination IP address is the AR-REPLICATOR IR-IP Address the
            node will process the packet normally as in [RFC7432].

!        o  If the destination IP address is the AR-REPLICATOR AR-IP Address the
            node MUST replicate the packet to local ACs and overlay
            tunnels (excluding the overlay tunnel to the source of the
            packet).  When replicating to remote AR-REPLICATORs the tunnel
!           destination IP address will be an IR-IP.  That will be an indication
            for the remote AR-REPLICATOR that it MUST NOT replicate to
!           overlay tunnels.  The tunnel source IP address used by the AR-
            REPLICATOR MUST be its IR-IP when replicating to either AR-
            REPLICATOR or AR-LEAF nodes.

***************
*** 628,642 ****
        and remote NVE/PEs), skipping the non-BM overlay tunnels.

     -  When an AR-REPLICATOR receives a BM packet on an overlay tunnel,
!       it will check the destination IP of the underlay IP header and:

!       o  If the destination IP matches its AR-IP, the AR-REPLICATOR will
           forward the BM packet to its flooding list (ACs and overlay
           tunnels) excluding the non-BM overlay tunnels.  The AR-
!          REPLICATOR will do source squelching to ensure the traffic is
           not sent back to the originating AR-LEAF.

!       o  If the destination IP matches its IR-IP, the AR-REPLICATOR will
           skip all the overlay tunnels from the flooding list, i.e.  it
           will only replicate to local ACs.  This is the regular IR
           behavior described in [RFC7432].
--- 725,742 ----
        and remote NVE/PEs), skipping the non-BM overlay tunnels.

     -  When an AR-REPLICATOR receives a BM packet on an overlay tunnel,
!       it will check the destination IP address of the underlay IP header and:

!       o  If the destination IP address matches its AR-IP, the AR-REPLICATOR will
           forward the BM packet to its flooding list (ACs and overlay
           tunnels) excluding the non-BM overlay tunnels.  The AR-
!          REPLICATOR will ensure the traffic is
           not sent back to the originating AR-LEAF.
+
+ Above, I suggested the removal of "do source squelching" since AFAICT
+ it removes jargon while leaving the intention clear.

!       o  If the destination IP address matches its IR-IP, the AR-REPLICATOR will
           skip all the overlay tunnels from the flooding list, i.e.  it
           will only replicate to local ACs.  This is the regular IR
           behavior described in [RFC7432].
***************
*** 645,650 ****
--- 745,754 ----
        is different for BM traffic, as far as Unknown unicast traffic
        forwarding is concerned, AR-LEAF nodes behave exactly in the same
        way as AR-REPLICATORs do.
+
+ I'm unclear why you're defining the behavior of AR-LEAF nodes here, when
+ you started by saying "An AR-REPLICATOR will follow..."  Surely, defining
+ AR-LEAF behavior here is misplaced?

     -  The AR-REPLICATOR/LEAF nodes will build an Unknown unicast flood-
        list composed of ACs and overlay tunnels to the IR-IP Addresses of
***************
*** 655,660 ****
--- 759,767 ----
        o  When an AR-REPLICATOR/LEAF receives an unknown packet on an AC,
           it will forward the unknown packet to its flood-list, skipping
           the non-U overlay tunnels.
+
+ Possibly the term "unknown packet" is well-understood by the target
+ audience, but I think it needs either an explanation or a reference here.

        o  When an AR-REPLICATOR/LEAF receives an unknown packet on an
           overlay tunnel will forward the unknown packet to its local ACs
***************
*** 688,696 ****
     b.  In this non-selective AR solution, the AR-LEAF MUST advertise a
         single Regular-IR inclusive multicast route as in [RFC7432].  The
         AR-LEAF SHOULD set the AR Type field to AR-LEAF.  Note that
!        although this flag does not make any difference for the egress
         nodes when creating an EVPN destination to the AR-LEAF, it is
!        RECOMMENDED to use this flag for an easy operation and
         troubleshooting of the BD.

     c.  In a service where there are no AR-REPLICATORs, the AR-LEAF MUST
--- 795,803 ----
     b.  In this non-selective AR solution, the AR-LEAF MUST advertise a
         single Regular-IR inclusive multicast route as in [RFC7432].  The
         AR-LEAF SHOULD set the AR Type field to AR-LEAF.  Note that
!        although this field does not make any difference for the egress
         nodes when creating an EVPN destination to the AR-LEAF, it is
!        RECOMMENDED to use this field for an easy operation and
         troubleshooting of the BD.

     c.  In a service where there are no AR-REPLICATORs, the AR-LEAF MUST
***************
*** 701,706 ****
--- 808,816 ----
         IGP or any other detection mechanism).  Ingress replication MUST
         use the forwarding information given by the remote Regular-IR
         Inclusive Multicast Routes as described in [RFC7432].
+
+ I found the above paragraph to be confusing.  Does it boil down to,
+ if there are no AR-REPLICATORS, use regular IR?
[jorge] pretty much. I tried to clarify better:

   c.  In a BD where there are no AR-REPLICATORs due to the AR-
       REPLICATORs being down or reconfigured, the AR-LEAF MUST use
       regular Ingress Replication, based on the remote Regular-IR
       Inclusive Multicast Routes as described in [RFC7432].  This may
       happen in the following cases:

       o  The AR-LEAF has a list of AR-REPLICATORs for the BD, but it
          detects that all the AR-REPLICATORs for the BD are down (via
          next-hop tracking in the IGP or any other detection
          mechanism).

       o  The AR-LEAF receives updates from all the former AR-
          REPLICATORs containing a non-REPLICATOR AR type in the
          Inclusive Multicast Ethernet Tag routes.

       o  The AR-LEAF never discovered an AR-REPLICATOR for the BD.
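
As a rough sketch of how the three cases collapse into a single check, assuming the AR-LEAF keeps a small per-BD record for every node it has ever learned as an AR-REPLICATOR (the `ReplicatorState` type and the function name are hypothetical, not from the draft):

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class ReplicatorState:
    """Hypothetical per-BD view of one node that was learned as an AR-REPLICATOR."""
    still_replicator: bool  # False once its route carries a non-REPLICATOR AR type
    reachable: bool         # False once next-hop tracking (or similar) declares it down


def must_use_regular_ir(replicators: Iterable[ReplicatorState]) -> bool:
    """Return True when the AR-LEAF has no usable AR-REPLICATOR for the BD."""
    # An empty input covers the "never discovered an AR-REPLICATOR" case.
    return not any(r.still_replicator and r.reachable for r in replicators)
```

With this view, a BD where some former AR-REPLICATORs are down and the rest were reconfigured also falls back to regular IR, which seems to be the intent of "there are no AR-REPLICATORs".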



     d.  In a service where there is one or more AR-REPLICATORs (based on
         the received Replicator-AR routes for the BD), the AR-LEAF can
***************
*** 709,720 ****
         o  A single AR-REPLICATOR MAY be selected for all the BM packets
            received on the AR-LEAF attachment circuits (ACs) for a given
            BD.  This selection is a local decision and it does not have
!           to match other AR-LEAF's selection within the same BD.

         o  An AR-LEAF MAY select more than one AR-REPLICATOR and do
            either per-flow or per-BD load balancing.

!        o  In case of a failure on the selected AR-REPLICATOR, another
            AR-REPLICATOR will be selected.

         o  When an AR-REPLICATOR is selected, the AR-LEAF MUST send all
--- 819,830 ----
         o  A single AR-REPLICATOR MAY be selected for all the BM packets
            received on the AR-LEAF attachment circuits (ACs) for a given
            BD.  This selection is a local decision and it does not have
!           to match other AR-LEAFs' selections within the same BD.

         o  An AR-LEAF MAY select more than one AR-REPLICATOR and do
            either per-flow or per-BD load balancing.

!        o  In case of a failure of the selected AR-REPLICATOR, another
            AR-REPLICATOR will be selected.

         o  When an AR-REPLICATOR is selected, the AR-LEAF MUST send all
***************
*** 752,757 ****
--- 862,874 ----
         to the AR-REPLICATOR and be programmed.  While the AR-REPLICATOR-
         activation-time is running, the AR-LEAF node will use regular
         ingress replication.
+
+ Probably you should say something about the case where a router has
+ selected its preferred AR-REPLICATOR from the set that are available,
+ and then a new AR-REPLICATOR shows up that is more preferable.  Should
+ the router shift to the new, preferred replicator?  Should it stick
+ with the one it was already using even though less-preferred?  Is it a
+ matter of local policy?
[jorge] sure, I added:

   f.  If the AR-LEAF has selected an AR-REPLICATOR, it is a matter of
       local policy to change to a new preferred AR-REPLICATOR for the
       existing BM traffic flows.

     An AR-LEAF will follow a data path implementation compatible with the
     following rules:
***************
*** 849,874 ****
         REPLICATORs will fall back to non-selective AR mode.

     c.  The Selective AR-REPLICATOR MUST follow the procedures described
!        in section Section 5.1, except for the following differences:

         o  The Replicator-AR route MUST include L=1 (Leaf Information
            Required) in the Replicator-AR route.  This flag is used by
            the AR-REPLICATORs to advertise their 'selective' AR-
            REPLICATOR capabilities.  In addition, the AR-REPLICATOR auto-
            configures its IP-address-specific import route-target as
!           described in section Section 4.

         o  The AR-REPLICATOR will build a 'selective' AR-LEAF-set with
            the list of nodes that requested replication to its own AR-IP.
            For instance, assuming NVE1 and NVE2 advertise a Leaf A-D
            route with PE1's IP-address-specific route-target and NVE3
            advertises a Leaf A-D route with PE2's IP-address-specific
!           route-target, PE1 MUST only add NVE1/NVE2 to its selective AR-
!           LEAF-set for BD-1, and exclude NVE3.

!        o  When a node defined and operating as Selective AR-REPLICATOR
            receives a packet on an overlay tunnel, it will do a tunnel
!           destination IP lookup and if the destination IP is the AR-
            REPLICATOR AR-IP Address, the node MUST replicate the packet
            to:

--- 966,997 ----
         REPLICATORs will fall back to non-selective AR mode.

     c.  The Selective AR-REPLICATOR MUST follow the procedures described
!        in Section 5.1, except for the following differences:

         o  The Replicator-AR route MUST include L=1 (Leaf Information
            Required) in the Replicator-AR route.  This flag is used by
            the AR-REPLICATORs to advertise their 'selective' AR-
            REPLICATOR capabilities.  In addition, the AR-REPLICATOR auto-
            configures its IP-address-specific import route-target as
!           described in the third bullet of the procedures for Leaf A-D
!           route in Section 4.

         o  The AR-REPLICATOR will build a 'selective' AR-LEAF-set with
            the list of nodes that requested replication to its own AR-IP.
            For instance, assuming NVE1 and NVE2 advertise a Leaf A-D
            route with PE1's IP-address-specific route-target and NVE3
            advertises a Leaf A-D route with PE2's IP-address-specific
!           route-target, PE1 will only add NVE1/NVE2 to its selective AR-
!           LEAF-set for BD-1, and exclude NVE3.  Likewise, PE2 will only
!           add NVE3 to its selective AR-LEAF-set for BD-1, and exclude
!           NVE1/NVE2.
!
! I changed the MUST to "will" above -- it's an example, it's inappropriate
! to use RFC 2119 type keywords in it.

!        o  When a node defined and operating as a Selective AR-REPLICATOR
            receives a packet on an overlay tunnel, it will do a tunnel
!           destination IP lookup and if the destination IP address is the AR-
            REPLICATOR AR-IP Address, the node MUST replicate the packet
            to:

***************
*** 878,893 ****
               overlay tunnel to the source AR-LEAF).

            +  overlay tunnels to the RNVEs if the tunnel source IP is the
!              IR-IP of an AR-LEAF (in any other case, the AR-REPLICATOR
!              MUST NOT replicate the BM traffic to remote RNVEs).  In
               other words, only the first-hop selective AR-REPLICATOR
               will replicate to all the RNVEs.

            +  overlay tunnels to the remote Selective AR-REPLICATORs if
!              the tunnel source IP is an IR-IP of its own AR-LEAF-set (in
               any other case, the AR-REPLICATOR MUST NOT replicate the BM
!              traffic to remote AR-REPLICATORs), where the tunnel
!              destination IP is the AR-IP of the remote Selective AR-
               REPLICATOR.  The tunnel destination IP AR-IP will be an

--- 1001,1016 ----
               overlay tunnel to the source AR-LEAF).

            +  overlay tunnels to the RNVEs if the tunnel source IP is the
!              IR-IP of an AR-LEAF.  In any other case, the AR-REPLICATOR
!              MUST NOT replicate the BM traffic to remote RNVEs.  In
               other words, only the first-hop selective AR-REPLICATOR
               will replicate to all the RNVEs.

            +  overlay tunnels to the remote Selective AR-REPLICATORs if
!              the tunnel source IP address is an IR-IP of its own AR-LEAF-set.  In
               any other case, the AR-REPLICATOR MUST NOT replicate the BM
!              traffic to remote AR-REPLICATORs.  When doing this replication, the tunnel
!              destination IP address is the AR-IP of the remote Selective AR-
               REPLICATOR.  The tunnel destination IP AR-IP will be an

***************
*** 911,916 ****
--- 1034,1042 ----
            destination IP addresses.  Some of those overlay tunnels MAY
            be flagged as non-BM receivers based on the BM flag received
            from the remote nodes in the BD.
+
+ It's not clear to me why you'd include "overlay tunnels ... flagged as
+ non-BM receivers" in a flood-list that's used for flooding BM traffic?
[jorge] this refers to the pruned-flood-lists capability that can be signaled by the remote nodes. I added:
“Some of those overlay tunnels MAY be flagged as non-BM receivers based on the BM flag received from the remote nodes **in the Inclusive Multicast Ethernet Tag routes.**”


        2.  Flood-list #2 - composed of ACs, a Selective AR-LEAF-set and a
            Selective AR-REPLICATOR-set, where:
***************
*** 928,945 ****
     -  When a Selective AR-REPLICATOR receives a BM packet on an AC, it
        will forward the BM packet to its flood-list #1, skipping the non-
        BM overlay tunnels.

     -  When a Selective AR-REPLICATOR receives a BM packet on an overlay
        tunnel, it will check the destination and source IPs of the
        underlay IP header and:

!       o  If the destination IP matches its AR-IP and the source IP
           matches an IP of its own Selective AR-LEAF-set, the AR-
           REPLICATOR will forward the BM packet to its flood-list #2, as
           long as the list of AR-REPLICATORs for the BD matches the
           Selective AR-REPLICATOR-set.  If the Selective AR-REPLICATOR-
           set does not match the list of AR-REPLICATORs, the node reverts
           back to non-selective mode and flood-list #1 is used.

        o  If the destination IP matches its AR-IP and the source IP does
           not match any IP of its Selective AR-LEAF-set, the AR-
--- 1054,1104 ----
     -  When a Selective AR-REPLICATOR receives a BM packet on an AC, it
        will forward the BM packet to its flood-list #1, skipping the non-
        BM overlay tunnels.
+
+ It sure seems like it would have been cleaner to have expressed this by
+ naming a list (Flood-list #3, whatever) that doesn't include the non-BM
+ overlay tunnels to begin with, and then saying that's the list used in
+ this case. I guess this also relates to my previous comment/question --
+ basically, why are the non-BM overlay tunnels even included?
[jorge] it comes down to the specific implementation, but the concept is that the overlay tunnels are added to the flood list irrespective of the BM flags. Since the AR capability can be implemented without supporting the pruned-flood-lists capability (the spec makes both things independent) we decided to describe the text in this way, i.e., we add the overlay tunnels to the flood list, and if the implementation supports the BM flags, it will skip the ones with the flag if needed. I’d prefer to keep the text as it is…


     -  When a Selective AR-REPLICATOR receives a BM packet on an overlay
        tunnel, it will check the destination and source IPs of the
        underlay IP header and:

!       o  If the destination IP address matches its AR-IP and the source IP address
           matches an IP of its own Selective AR-LEAF-set, the AR-
           REPLICATOR will forward the BM packet to its flood-list #2, as
           long as the list of AR-REPLICATORs for the BD matches the
           Selective AR-REPLICATOR-set.  If the Selective AR-REPLICATOR-
           set does not match the list of AR-REPLICATORs, the node reverts
           back to non-selective mode and flood-list #1 is used.
+
+ Presumably this time the non-BM overlay tunnels are NOT excluded?
[jorge] they can potentially be excluded. I added a sentence.

+
+ Also, I guess the language above is where the answer to the "fall back to
+ non-selective AR mode" puzzle from point b, above, is hidden. It requires
+ that I make some assumptions:
+
+ - The "list of AR-REPLICATORS for the BD" is derived from the set of
+   AR-REPLICATOR advertisements for the BD. (This is not intuitively
+   obvious; "list" is very generic and could be, for example, configured
+   or something.)
+ - The Selective AR-REPLICATOR-set is all the members of the above list
+   that have advertised L=1.
+ - Ergo, if the sets aren't identical, some of them must have advertised
+   L=0.
+
+ It seems to me as though it would be more understandable to say something
+ like:
+
+ --
+       o  If the destination IP address matches its AR-IP and the source IP address
+          matches an IP of its own Selective AR-LEAF-set, the AR-
+          REPLICATOR will forward the BM packet to its flood-list #2,
+          unless some AR-REPLICATOR within the BD has advertised L=0.
+          In the latter case, the node reverts
+          back to non-selective mode and flood-list #1 is used.
+ --
[jorge] good point. Changed it.


        o  If the destination IP matches its AR-IP and the source IP does
           not match any IP of its Selective AR-LEAF-set, the AR-
***************
*** 960,970 ****
           This is the regular-IR behavior described in [RFC7432].

     -  In any case, non-BM overlay tunnels are excluded from flood-lists
!       and, also, source squelching is always done in order to ensure the
        traffic is not sent back to the originating source.  If the
!       encapsulation is MPLSoGRE (or MPLSoUDP) and the BD label is not
        the bottom of the stack, the AR-REPLICATOR MUST copy the rest of
        the labels when forwarding them to the egress overlay tunnels.

  6.2.  Selective AR-LEAF procedures

--- 1119,1142 ----
           This is the regular-IR behavior described in [RFC7432].

     -  In any case, non-BM overlay tunnels are excluded from flood-lists
!
! That seems inconsistent with what point 1, above, says -- the place where
! I asked why you'd include non-BM receivers.  In any case, there you say
! they can be part of flood-list #1. Here you say they "are excluded".
! Which is it?
[jorge] fixed it in the new version. Now it should be consistent.

!
!       and, also,
        traffic is not sent back to the originating source.  If the
!       encapsulation is MPLSoGRE or MPLSoUDP and the BD label is not
        the bottom of the stack, the AR-REPLICATOR MUST copy the rest of
        the labels when forwarding them to the egress overlay tunnels.
+
+ Above, I removed "source squelching" again since it seemed not to add
+ anything, as previously.
+
+ Reference needed for "BD label".  I also wonder, is the requirement that
+ the replicator copy the rest of the labels a new one introduced here, or
+ are you just repeating an existing requirement from an underlying spec?
[jorge] it is a requirement in this spec for the AR-REPLICATOR. I clarified the sentence and added it to the non-selective AR-REPLICATOR procedures, since it was missing.
New text in the selective ar-replicator rules:

   A Selective AR-REPLICATOR data path implementation MUST be compatible
   with the following rules:

   -  The Selective AR-REPLICATORs will build two flood-lists:

      1.  Flood-list #1 - composed of Attachment Circuits and overlay
          tunnels to the remote nodes in the BD, always using the IR-IPs
          in the tunnel destination IP addresses.

      2.  Flood-list #2 - composed of Attachment Circuits, a Selective
          AR-LEAF-set and a Selective AR-REPLICATOR-set, where:

          +  The Selective AR-LEAF-set is composed of the overlay
             tunnels to the AR-LEAFs that advertise a Leaf Auto-
             Discovery route for the local AR-REPLICATOR.  This set is
             updated with every Leaf Auto-Discovery route received/
             withdrawn from a new AR-LEAF.

          +  The Selective AR-REPLICATOR-set is composed of the overlay
             tunnels to all the AR-REPLICATORs that send a Replicator-AR
             route with L=1.  The AR-IP addresses are used as tunnel
             destination IP.

   -  Some of the overlay tunnels in the flood-lists MAY be flagged as
      non-BM receivers based on the BM flag received from the remote
      nodes in the routes.

   -  When a Selective AR-REPLICATOR receives a BM packet on an
      Attachment Circuit, it MUST forward the BM packet to its flood-
      list #1, skipping the non-BM overlay tunnels.

   -  When a Selective AR-REPLICATOR receives a BM packet on an overlay
      tunnel, it will check the destination and source IPs of the
      underlay IP header and:

      o  If the destination IP address matches its AR-IP and the source
         IP address matches an IP of its own Selective AR-LEAF-set, the
         AR-REPLICATOR MUST forward the BM packet to its flood-list #2,
         unless some AR-REPLICATOR within the BD has advertised L=0.  In
         the latter case, the node reverts back to non-selective mode
         and flood-list #1 MUST be used.  Non-BM overlay tunnels are
         skipped when sending BM packets.

      o  If the destination IP address matches its AR-IP and the source
         IP address does not match any IP address of its Selective AR-
         LEAF-set, the AR-REPLICATOR MUST forward the BM packet to
         flood-list #2 but skipping the AR-REPLICATOR-set.  Non-BM
         overlay tunnels are skipped when sending BM packets.

      o  If the destination IP address matches its IR-IP, the AR-
         REPLICATOR MUST use flood-list #1 but MUST skip all the overlay
         tunnels from the flooding list, i.e. it will only replicate to
         local Attachment Circuits.  This is the regular-IR behavior
         described in [RFC7432].  Non-BM overlay tunnels are skipped
         when sending BM packets.

   -  In any case, the AR-REPLICATOR ensures the traffic is not sent
      back to the originating source.  If the encapsulation is MPLSoGRE
      or MPLSoUDP and the received BD label (or label that the AR-
      REPLICATOR advertised in the Replicator-AR route) is not the
      bottom of the stack, the AR-REPLICATOR MUST copy the rest of the
      labels when forwarding them to the egress overlay tunnels.
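
To make the overlay-tunnel rules above easier to check, here is a minimal Python sketch of the flood-list choice. It is only one possible reading of the text: the function and parameter names are hypothetical, tunnels are identified by their destination IP address, the non-BM set is keyed by that same address, and dropping packets addressed to neither the AR-IP nor the IR-IP is an assumption.

```python
from typing import AbstractSet, List


def selective_egress(dst_ip: str, src_ip: str, *, ar_ip: str, ir_ip: str,
                     acs: List[str], flood1: List[str], leaf_set: List[str],
                     replicator_set: List[str], l0_seen: bool,
                     non_bm: AbstractSet[str] = frozenset()) -> List[str]:
    """Egress list for a BM packet received on an overlay tunnel.

    flood1:         remote IR-IPs (overlay tunnels of flood-list #1)
    leaf_set:       IR-IPs of the AR-LEAFs that sent a Leaf A-D route to us
    replicator_set: AR-IPs of the AR-REPLICATORs that advertised L=1
    l0_seen:        True if some AR-REPLICATOR in the BD advertised L=0
    """
    if dst_ip == ir_ip:
        # Regular-IR behavior: local Attachment Circuits only.
        return list(acs)
    if dst_ip != ar_ip:
        return []  # assumption: not addressed to this AR-REPLICATOR
    if src_ip in leaf_set:
        # Flood-list #2, or flood-list #1 when reverting to non-selective mode.
        tunnels = flood1 if l0_seen else leaf_set + replicator_set
    else:
        # Flood-list #2 but skipping the AR-REPLICATOR-set.
        tunnels = leaf_set
    # Never send back to the source; skip tunnels flagged as non-BM.
    return list(acs) + [t for t in tunnels if t != src_ip and t not in non_bm]
```

The sketch leaves out the MPLSoGRE/MPLSoUDP label-copy requirement, which concerns the packet contents rather than the choice of egress tunnels.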



  6.2.  Selective AR-LEAF procedures

***************
*** 991,996 ****
--- 1163,1174 ----
     b.  The AR-LEAF MAY advertise a Regular-IR route if there are RNVEs
         in the BD.  The Selective AR-LEAF MUST advertise a Leaf A-D route
         after receiving a Replicator-AR route with L=1.  It is
+
+ "after receiving" -- so, does this mean it MUST NOT advertise a Leaf A-D
+ route prior to receiving any Replicator-AR route with L=1?  That would also
+ imply that if all Replicator-AR routes with L=1 are withdrawn, the Leaf A-D
+ route MUST be withdrawn?
[jorge] yes. I added a sentence to clarify that.

+
         RECOMMENDED that the Selective AR-LEAF waits for a AR-LEAF-join-
         wait-timer (in seconds, default value is 3) before sending the
         Leaf A-D route, so that the AR-LEAF can collect all the
***************
*** 998,1004 ****
         route.

     c.  In a service where there is more than one Selective AR-
!        REPLICATORs the Selective AR-LEAF MUST locally select a single
         Selective AR-REPLICATOR for the BD.  Once selected:

--- 1176,1182 ----
         route.

     c.  In a service where there is more than one Selective AR-
!        REPLICATOR the Selective AR-LEAF MUST locally select a single
         Selective AR-REPLICATOR for the BD.  Once selected:

***************
*** 1021,1026 ****
--- 1199,1211 ----
         o  In case of a failure on the selected AR-REPLICATOR, another
            AR-REPLICATOR will be selected and a new Leaf A-D update will
            be issued for the new AR-REPLICATOR.  This new route will
+
+ What does "in case of a failure on the selected AR-REPLICATOR" mean,
+ practically speaking?  How is this detected?  I presume the failure
+ is detected when the relevant route becomes infeasible as the result
+ of any of the relevant underlying BGP mechanisms (nexthop unresolvability,
+ holdtime expired, route withdrawal, etc).
[jorge] precisely. I added a sentence taking your words to clarify: “In case of a failure on the selected AR-REPLICATOR (detected when the Replicator-AR route becomes infeasible as the result of any of the underlying BGP mechanisms), …”

+
            update the selective list in the new Selective AR-REPLICATOR.
            In case of failure on the active Selective AR-REPLICATOR, it
            is RECOMMENDED for the Selective AR-LEAF to revert to IR
***************
*** 1030,1035 ****
--- 1215,1223 ----
            AR mode with the new Selective AR-REPLICATOR.  The AR-
            REPLICATOR-activation-timer MAY be the same configurable
            parameter as in Section 5.2.
+
+ What happens if a new AR-REPLICATOR is learned by the AR-LEAF, and the
+ new replicator is preferred over the currently-selected one?
[jorge] added:
“A Selective AR-LEAF MAY change the AR-REPLICATOR(s) selection dynamically, due to an administrative or policy configuration change.”


     All the AR-LEAFs in a BD are expected to be configured as either
     selective or non-selective.  A mix of selective and non-selective AR-
***************
*** 1045,1051 ****

        1.  Flood-list #1 - composed of ACs and the overlay tunnel to the
            selected AR-REPLICATOR (using the AR-IP as the tunnel
!           destination IP).

        2.  Flood-list #2 - composed of ACs and overlay tunnels to the
            remote IR-IP Addresses.
--- 1233,1239 ----

        1.  Flood-list #1 - composed of ACs and the overlay tunnel to the
            selected AR-REPLICATOR (using the AR-IP as the tunnel
!           destination IP address).

        2.  Flood-list #2 - composed of ACs and overlay tunnels to the
            remote IR-IP Addresses.
***************
*** 1054,1061 ****
        there is any selected AR-REPLICATOR.  If there is, flood-list #1
        will be used.  Otherwise, flood-list #2 will.

!    -  When an AR-LEAF receives a BM packet on an overlay tunnel, will
!       forward the BM packet to its local ACs and never to an overlay
        tunnel.  This is the regular IR behavior described in [RFC7432].

--- 1242,1249 ----
        there is any selected AR-REPLICATOR.  If there is, flood-list #1
        will be used.  Otherwise, flood-list #2 will.

!    -  When an AR-LEAF receives a BM packet on an overlay tunnel, it will
!       forward the packet to its local ACs and never to an overlay
        tunnel.  This is the regular IR behavior described in [RFC7432].

***************
*** 1071,1076 ****
--- 1259,1267 ----
     In addition to AR, the second optimization supported by this solution
     is the ability for the all the BD nodes to signal Pruned-Flood-Lists
     (PFL).  As described in section 3, an EVPN node can signal a given
+
+ I guess you meant Section 4?
+
     value for the BM and U PFL flags in the IR Inclusive Multicast
     Routes, where:

***************
*** 1085,1090 ****
--- 1276,1286 ----
     PFL flag and remove the sender from the corresponding flood-list.  A
     given BD node receiving BUM traffic on an overlay tunnel MUST
     replicate the traffic normally, regardless of the signaled PFL flags.
+
+ What exactly does "replicate the traffic normally" mean, in the context
+ of this specification?  I guess you should say something like "replicate
+ the traffic according to [reference]".  Also, I don't get it: what are the
+ flags FOR, if they're ignored when receiving on an overlay tunnel?
[jorge] I clarified the entire section with things that were missing/unclear. The sentence you referred to was indeed confusing. I replaced it with:
“An AR-LEAF or RNVE receiving BUM traffic on an overlay tunnel MUST replicate the traffic to its local Attachment Circuits, regardless of the BM/U flags on the overlay tunnels.”


     This optimization MAY be used along with the AR solution.

***************
*** 1123,1128 ****
--- 1319,1328 ----

         NVE2, but not to NVE3.  PE2 and NVE2 will replicate the BM
+
+ "but not to NVE3".  What happened to "MUST replicate the traffic normally"?
+ To me, these two pieces of text seem to contradict one another.
[jorge] hope the above change clarifies.

+
         packets to their local ACs but we will avoid NVE3 having to
         replicate unnecessarily those BM packets to VM31 and VM32.

***************
*** 1135,1147 ****
--- 1335,1357 ----
         NVE3 to NVE2, PE1 and PE2 but not NVE1.  The solution avoids the
         unnecessary replication to NVE1, since the destination of the
         unknown traffic cannot be at NVE1.
+
+ It's not clear to me why the destination can't be at NVE1.

     4.  Any Unknown unicast packet sent from TS1 will be forwarded by PE1
         to the WAN link, PE2 and NVE2 but not to NVE1 and NVE3, since the
         target of the unknown traffic cannot be at those NVEs.
+
+ Similarly, I don't get why this is the case.
[jorge] I added this, does it help?:

That is, neither NVE1 nor NVE3 is
      interested in receiving BM or Unknown Unicast traffic since:

      o  Their attached VMs (VM11, VM12, VM31, VM32) do not support
         multicast applications.

      o  Their attached VMs will not receive ARP Requests.  Proxy-ARP
         [I-D.ietf-bess-evpn-proxy-arp-nd] on the remote NVE/PEs will
         reply to ARP Requests locally, and no other Broadcast is expected.

      o  Their attached VMs will not receive unknown unicast traffic,
         since the VMs' MAC and IP addresses are always advertised by
         EVPN as long as the VMs are active.
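
To illustrate the effect on the sender side, here is a small Python sketch of how PE1 might prune its overlay flood-lists in this example. It assumes that a set BM or U flag means "prune me from that flood-list" and that NVE1 and NVE3 signaled both flags; the function name and the dictionary layout are hypothetical.

```python
from typing import Dict, List


def pruned_flood_list(remote_flags: Dict[str, Dict[str, bool]], traffic: str) -> List[str]:
    """Return the remote nodes that still receive `traffic` ("BM" or "U").

    A True flag is assumed to mean the node asked to be pruned for that traffic type.
    """
    if traffic not in ("BM", "U"):
        raise ValueError(f"unknown traffic type: {traffic!r}")
    return [node for node, flags in remote_flags.items() if not flags[traffic]]


# PE1's view of the example BD: NVE1 and NVE3 signaled both flags.
pe1_view = {
    "PE2": {"BM": False, "U": False},
    "NVE1": {"BM": True, "U": True},
    "NVE2": {"BM": False, "U": False},
    "NVE3": {"BM": True, "U": True},
}
print(pruned_flood_list(pe1_view, "U"))  # ['PE2', 'NVE2']
```

This matches point 4 of the example for the overlay part: unknown unicast from TS1 goes to PE2 and NVE2 but not to NVE1 or NVE3. The WAN link is an Attachment Circuit and is not affected by the flags.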



  8.  AR Procedures for single-IP AR-REPLICATORS

+ I'm curious why the design choice was made to specify two different ways to
+ do the same thing.  You motivate why not all routers can use distinguished
+ IP addresses for the two different functional modes; however, presumably all
+ routers could make use of distinguished VNIs as you do here.  I'd appreciate
+ a few words about why you didn't choose to just always use the VNI approach.
+
[jorge] actually we found the two cases:

  *   typically VNIs are global per VNI, so the dual IP solution works better with some merchant silicon.
  *   However, we found some cases where supporting more than one loopback IP on the NVE was not trivial (actually the request came from one of the vendors with authors/contributors in the document – not mine 😊)
That’s why we decided to include both. The implementations I know of (and were tested in public interop events) use different IP addresses and same VNI.

     The procedures explained in sections Section 5 and Section 6 assume
     that the AR-REPLICATOR can use two local routable IP addresses to
     terminate and originate NVO tunnels, i.e. IR-IP and AR-IP addresses.
***************
*** 1184,1201 ****
  9.  AR Procedures and EVPN All-Active Multi-homing Split-Horizon

     This section extends the procedures for the cases where AR-LEAF nodes
!    or AR-REPLICATOR nodes are attached to the the same Ethernet Segment
     in the BD.  The case where one (or more) AR-LEAF node(s) and one (or
     more) AR-REPLICATOR node(s) are attached to the same Ethernet Segment
     is out of scope.

  9.1.  Ethernet Segments on AR-LEAF nodes

     If VXLAN or NVGRE are used, and if the Split-horizon is based on the
     tunnel IP SA and "Local-Bias" as described in [RFC8365], the Split-
     horizon check will not work if there is an Ethernet-Segment shared
!    between two AR-LEAF nodes, and the AR-REPLICATOR changes the tunnel
     IP SA of the packets with its own AR-IP.

     In order to be compatible with the IP SA split-horizon check, the AR-
     REPLICATOR MAY keep the original received tunnel IP SA when
--- 1394,1418 ----
  9.  AR Procedures and EVPN All-Active Multi-homing Split-Horizon

     This section extends the procedures for the cases where AR-LEAF nodes
!    or AR-REPLICATOR nodes are attached to the same Ethernet Segment
     in the BD.  The case where one (or more) AR-LEAF node(s) and one (or
     more) AR-REPLICATOR node(s) are attached to the same Ethernet Segment
     is out of scope.
+
+ I just can't understand what this paragraph is telling me. :-(  Apart from
+ anything else, to the casual reader the second sentence seems to contradict
+ the first.
[jorge] the sentence is indeed unfortunate, changed it to:
“This section extends the procedures for the cases where two or more AR-LEAF nodes are attached to the same Ethernet Segment, and where two or more AR-REPLICATOR nodes are attached to the same Ethernet Segment, in the BD. The mixed case, in which an AR-LEAF node and an AR-REPLICATOR node are attached to the same Ethernet Segment, is out of scope.”


  9.1.  Ethernet Segments on AR-LEAF nodes

     If VXLAN or NVGRE are used, and if the Split-horizon is based on the
     tunnel IP SA and "Local-Bias" as described in [RFC8365], the Split-
     horizon check will not work if there is an Ethernet-Segment shared
!    between two AR-LEAF nodes, and the AR-REPLICATOR replaces the tunnel
     IP SA of the packets with its own AR-IP.
+
+ I changed "changes" to "replaces"; it's my best guess as to what you meant.
+ If that's wrong, please help me understand what you did mean.

     In order to be compatible with the IP SA split-horizon check, the AR-
     REPLICATOR MAY keep the original received tunnel IP SA when
***************
*** 1203,1209 ****
     LEAF nodes to apply Split-horizon check procedures for BM packets,
     before sending them to the local Ethernet-Segment.  Even if the AR-
     LEAF's IP SA is preserved when replicating to AR-LEAFs or RNVEs, the
!    AR-REPLICATOR MUST always use its IR-IP as IP SA when replicating to
     other AR-REPLICATORs.

     When EVPN is used for MPLS over GRE (or UDP), the ESI-label based
--- 1420,1426 ----
     LEAF nodes to apply Split-horizon check procedures for BM packets,
     before sending them to the local Ethernet-Segment.  Even if the AR-
     LEAF's IP SA is preserved when replicating to AR-LEAFs or RNVEs, the
!    AR-REPLICATOR MUST always use its IR-IP as the IP SA when replicating to
     other AR-REPLICATORs.

     When EVPN is used for MPLS over GRE (or UDP), the ESI-label based
***************
*** 1220,1226 ****

  9.2.  Ethernet Segments on AR-REPLICATOR nodes

!    Ethernet Segments associated to one or more AR-REPLICATOR nodes
     SHOULD follow "Local-Bias" procedures for EVPN all-active multi-
     homing, as follows:

--- 1437,1443 ----

  9.2.  Ethernet Segments on AR-REPLICATOR nodes

!    Ethernet Segments associated with one or more AR-REPLICATOR nodes
     SHOULD follow "Local-Bias" procedures for EVPN all-active multi-
     homing, as follows:

***************
*** 1240,1245 ****
--- 1457,1464 ----
        it had been received on a local AC that is part of the ES and will
        be forwarded to all local ES, irrespective of their DF or NDF
        state.
+
+ Please define/expand "ES".

     -  BUM traffic received on an AR-REPLICATOR overlay tunnel with IR-IP
        as the IP DA, will follow regular [RFC8365] "Local-Bias" rules and
***************
*** 1254,1259 ****
--- 1473,1483 ----
     In addition, the procedures introduced by this document may bring
     some new risks for the successful delivery of BM traffic.  Unicast
     traffic is not affected by this document.  The forwarding of
+
+ If unicast traffic isn't affected, what's the U flag even for?  It sure
+ seems as though it's intended to affect the forwarding of (unknown)
+ unicast traffic.
[jorge] good point, changed it to:
“In addition, the Assisted-Replication method introduced by this document may bring some new risks for the successful delivery of BM traffic. Unicast traffic is not affected by Assisted-Replication (although Unknown unicast traffic is affected by the Pruned-Flood-Lists procedures). The forwarding of Broadcast and Multicast (BM) traffic is modified, and BM traffic from the AR-LEAF nodes will be attracted by the existence of AR-REPLICATORs in the BD. An AR-LEAF will forward BM traffic to its selected AR-REPLICATOR; therefore, an attack on the AR-REPLICATOR could impact the delivery of the BM traffic using that node.”

+
     Broadcast and Multicast (BM) traffic is modified though, and BM
     traffic from the AR-LEAF nodes will be attracted by the existance of
     AR-REPLICATORs in the BD.  An AR-LEAF will forward BM traffic to its
***************
*** 1262,1270 ****

     An implementation following the procedures in this document should
     not create BM loops, since the AR-REPLICATOR will always forward the
     BM traffic using the correct tunnel IP Destination Address that
     indicates the remote nodes how to forward the traffic.  This is true
!    in both, the Non-Selective and Selective modes defined in this
     document.

     The Selective mode provides a multi-staged replication solution,
--- 1486,1503 ----

     An implementation following the procedures in this document should
     not create BM loops, since the AR-REPLICATOR will always forward the
+
+ Instead of "should not create BM loops" I suggest "will not create" or
+ if you can't actually promise that, "is not expected to create".  I assume
+ you're using "should" in the sense of weak expectation, and not like a
+ RFC 2119 SHOULD.
+
[jorge] changed the paragraph based on previous feedback:
“This document introduces the ability for the AR-REPLICATOR to forward traffic received on an overlay tunnel to another overlay tunnel. The reader may interpret that this introduces the risk of BM loops, that is, an AR-LEAF receiving a BM encapsulated packet that it originated in the first place, due to one or two AR-REPLICATORs "looping" the BM traffic back to the AR-LEAF. The procedures in this document prevent these BM loops, since the AR-REPLICATOR will always forward the BM traffic using the correct tunnel IP Destination Address that instructs the remote nodes how to forward the traffic. This is true in both the Non-Selective and Selective modes defined in this document. However, an incorrect implementation of the procedures in this document may lead to those unexpected BM loops.”

     BM traffic using the correct tunnel IP Destination Address that
     indicates the remote nodes how to forward the traffic.  This is true
!
! Instead of "indicates", try instructs, cues, or directs?
!
!    in both the Non-Selective and Selective modes defined in this
     document.

     The Selective mode provides a multi-staged replication solution,