[Bier] Benjamin Kaduk's Discuss on draft-ietf-bier-te-arch-10: (with DISCUSS and COMMENT)

Benjamin Kaduk via Datatracker <noreply@ietf.org> Thu, 26 August 2021 06:59 UTC

From: Benjamin Kaduk via Datatracker <noreply@ietf.org>
To: The IESG <iesg@ietf.org>
Cc: draft-ietf-bier-te-arch@ietf.org, bier-chairs@ietf.org, bier@ietf.org, Xuesong Geng <gengxuesong@huawei.com>, aretana.ietf@gmail.com, gengxuesong@huawei.com
Reply-To: Benjamin Kaduk <kaduk@mit.edu>
Message-ID: <162996119973.19003.11174569450531796563@ietfa.amsl.com>
Date: Wed, 25 Aug 2021 23:59:59 -0700
Archived-At: <https://mailarchive.ietf.org/arch/msg/bier/37Vwp4gJlI_jGtzB2DUjNCQRqcU>
Subject: [Bier] Benjamin Kaduk's Discuss on draft-ietf-bier-te-arch-10: (with DISCUSS and COMMENT)

Benjamin Kaduk has entered the following ballot position for
draft-ietf-bier-te-arch-10: Discuss

When responding, please keep the subject line intact and reply to all
email addresses included in the To and CC lines. (Feel free to cut this
introductory paragraph, however.)


Please refer to https://www.ietf.org/iesg/statement/discuss-criteria.html
for more information about DISCUSS and COMMENT positions.


The document, along with other ballot positions, can be found here:
https://datatracker.ietf.org/doc/draft-ietf-bier-te-arch/



----------------------------------------------------------------------
DISCUSS:
----------------------------------------------------------------------

I think there's an issue with the pseudocode in Figure 6.  While I
understand that it's pseudocode, any reasonable interpretation I can
come up with for the "~" and "&=" operators seems to result in
performing an operation logically equivalent to:

X = AdjacentBits[SI]
Packet->BitString = Packet->BitString & ~X & X

that can be optimized to

Packet->BitString = 0

and I do not think that the only bits that are supposed to be set in the
outgoing packet/packet copy are the ones for which DNC is set -- bits
that we did not find in our BIFT should remain set in outgoing packets.
(Slightly more detail in the COMMENT.)
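
To make the concern concrete, here is a minimal Python sketch of that
reading of the two statements.  The function name and the 8-bit width
are illustrative assumptions, not taken from the draft; the sketch only
checks that masking with ~X and then with X clears every bit, whatever
the inputs.

```python
from itertools import product

WIDTH = 8  # illustrative BitString length (assumption); real BSLs are larger
MASK = (1 << WIDTH) - 1

def apply_both_masks(bitstring: int, adjacent_bits: int) -> int:
    """Hypothetical reading of Figure 6: 'BitString &= ~X', then 'BitString &= X'."""
    bitstring &= ~adjacent_bits & MASK  # clear the bits this BFR has adjacencies for
    bitstring &= adjacent_bits          # then keep only those same bits
    return bitstring

# Exhaustively check every 8-bit BitString / AdjacentBits combination.
all_zero = all(
    apply_both_masks(bs, x) == 0
    for bs, x in product(range(MASK + 1), repeat=2)
)
print(all_zero)  # True
```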


----------------------------------------------------------------------
COMMENT:
----------------------------------------------------------------------

I think it would be helpful to have a clear statement early on that the
DNC bit is applicable only to "forward_connected()" adjacencies and not
to "forward_routed()", that is, earlier than the guidance in §5.2.1.

A reader in a hurry can easily miss that plain "BIER" means "regular
BIER", as opposed to the "BIER-TE" that this document specifies.  It
might be worth adding a qualifier such as "non-TE BIER" in at least some
instances, especially where a paragraph talks about BIER as a set-up for
a comparison with BIER-TE.

Thanks for the conversation with the gen-art reviewer; I think it's important
to get in the clarifications that were proposed in that thread.

Section 2.3

   BIER-TE is also intended as an option to expand the BIER architecture
   into deployments where BIER may not be the best fit, [...]

It's fairly surprising to advocate for doing something even when there
are known better options to achieve the same goal.

Section 4.3

It seems that these coexistence schemes rely on something like a central
controller entity to configure all BFRs with the criteria for assessing
whether a given packet is in "BIER mode" or "BIER-TE mode".  It seems
prudent to consider the risk that this information is not distributed
uniformly in time (especially in the case of dynamic updates) or
otherwise becomes stale on some particular node(s).  Would the network
be "self-healing" in the face of a node that persistently
mis-forwards/replicates a given class of packet (by virtue of treating
it as BIER rather than BIER-TE, or vice versa)?

Section 4.4

        AdjacentBits = Packet->BitString &= ~AdjacentBits[SI];

The "A = B &= C" construct might be overly clever for what is intended
to be illustrative pseudocode...

        Packet->BitString &= AdjacentBits[SI];

... and the combination of "Packet->BitString &= ~X" and
"Packet->BitString &= X" seems to result in all bits getting cleared, if
I'm not mistaken.  Surely that's not the intent (per the Discuss)...

Section 4.5

I think the example that walks through the packet flow and talks about
which bits are cleared along the way is a helpful example.

Section 5.1.3

   All leaf-BFERs in a BIER-TE domain can share a single bit position.
   This is possible because the bit position for the adjacency to reach
   the BFER can be used to distinguish whether or not packets should
   reach the BFER.

I'm not sure I understand this properly.  I think that this single
shared bit is used both to indicate "send along this adjacency" and
(implicitly?) "local_decap when it gets here".  It seems like this could
result in spurious packet copies egressing from some BFERs.  Consider
(using the left topology from Figure 10) a case where only egress from
BFER2 is needed, but some other constraints meant that the only possible
path from BFIR to BFER2 involved traversing BFR1 on the way to BFR2.
Even though BFR1 only needs to be a transit router for this flow, it
will still send a packet copy to BFER1 (where it gets decapsulated and
sent out).  Even worse, if DNC is not set, it seems that the copy that
leaves BFR1 and makes its way to BFR2 will have the bit cleared that's
used to indicate "leaf BFER".  So, either I'm confused about how this
works (something that's quite plausible!), or there are some other
undocumented constraints here on either the topology or the use of DNC.

Reading forward to the summary in §5.1.10, perhaps my assumption about
indicating "send along this adjacency" was incorrect.  Still, I
shouldn't need to read the later section in order to understand the
earlier one, so adding a few more words here may be in order.

Section 5.1.5

   In a setup with a hub and multiple spokes connected via separate p2p
   links to the hub, all p2p links can share the same bit position.  The

Is this for the adjacency to the hub, from the hub, or the same bit for
both?

Section 5.2.1

   Link layers without port unique link layer addresses should not be
   used with the DNC flag set.

"should not" or "MUST NOT"?

Section 5.2.2

   When links are incorrectly physically re-connected before the BIER-TE
   Controller updates BitStrings in BFIRs, duplicates can happen.  Like
   loops, these can be inhibited by link layer addressing in
   forward_connected() adjacencies.

(Likewise.)

Section 5.3.2

   and the amount of alternative paths in it.  The higher the percentage
   of non-BFER bits, the higher the likelihood, that those topology bits
   are not just BIER-TE overhead without additional benefit, but instead
   that they will allow to express desirable path steering alternatives.

This assessment of likelihood seems unsupportable without some
additional assumptions or preconditions.  If the sampling domain is
"all possible representations of subtopologies" I think the statement is
false.  It only seems likely to hold if the sampling domain is limited
to topology descriptions crafted by BIER-TE experts.

Section 5.3.3

   In BIER-TE, for a non-leaf BFER, there is usually a single BP for
   that BFER with a local_decap() adjacency on the BFER.  The BFR-id for
   such a BFER can therefore equally be determined as in BIER: BFR-id =
   SI * BitStringLength + BP.

I don't think I understand why this constraint is necessary.  What
component is going to know which BFRs are BFERs and rely on this
property of the BFR-id?
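
For reference, the quoted mapping is plain arithmetic.  The sketch below
is only an illustration of the formula as quoted; the function names are
hypothetical, and the inverse assumes bit positions are numbered
1..BitStringLength within each SI.

```python
def bfr_id(si: int, bit_string_length: int, bp: int) -> int:
    """BFR-id = SI * BitStringLength + BP, as quoted from Section 5.3.3."""
    return si * bit_string_length + bp

def si_and_bp(bfr_id_value: int, bit_string_length: int) -> tuple:
    """Inverse mapping, assuming BP runs from 1 to BitStringLength."""
    si, bp_zero_based = divmod(bfr_id_value - 1, bit_string_length)
    return si, bp_zero_based + 1

print(bfr_id(1, 256, 1))    # 257
print(si_and_bp(257, 256))  # (1, 1)
```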

Section 5.3.5

   It is not currently determined if a single sub-domain could or should
   be allowed to forward both BIER and BIER-TE packets.  If this should
   be supported, there are two options:

This sounds kind of Experimental.  Perhaps this content is more
appropriate in an Appendix?

   this approach.  Depending on topology, only the same 20%..80% of bits
   as possible for BIER-TE can be used for BIER.

Where does the 20-80% range come from?

Section 5.3.6.1

   Consider a network setup with a BSL of 256 for a network topology as
   shown in Figure 20.  The network has 6 areas, each with 170 BFRs,
   connecting via a core with 4 (core) BFRs.  To address all BFERs with
   BIER, 4 SIs are required.  To send a BIER packet to all BFER in the

How many BFERs per area?
What calculation produces the "4 SIs are required" result?
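
One guess at the arithmetic, which the text should confirm or correct:
if every BFR in every area is a BFER and each needs its own bit, then
6 * 170 = 1020 bits are needed, and ceil(1020 / 256) = 4.  The sketch
below just encodes that assumption; the function name is hypothetical.

```python
def sis_required(bfer_count: int, bit_string_length: int) -> int:
    """Number of SIs needed if every BFER gets its own bit (ceiling division)."""
    return -(-bfer_count // bit_string_length)

areas = 6
bfrs_per_area = 170   # assumption: all of them are BFERs
bsl = 256

print(sis_required(areas * bfrs_per_area, bsl))  # 4
```

Under the same assumption, any BFER count from 769 through 1024 also
yields 4 SIs, so the text could state the per-area BFER count without
changing the result.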

Section 5.3.6.2

   On all BFIRs in an area j|j=2...6, bia in each BIFT:SI is populated
   with the same forward_routed(BFRja), and bib with
   forward_routed(BFRjb).  On all area edge BFR, bea in
   BIFT:SI=k|k=2...6 is populated with forward_routed(BFRka) and beb in

Why do j and k range from 2 through 6, excluding 1?  Where is area 1
handled?

Section 6

   Adjacency scope could be global, but then every BFR would need an
   adjacency for this BP, for example a forward_routed() adjacency with
   encapsulation to the global SR SID of the destination.  Such a BP
   would always result in ingress replication though.  The first BFR

I don't think I know what "ingress replication" is.
Are we trying to say that all the replication that occurs for traffic to
this BP is created starting at the ingress node and there can be no
efficiency gains by deferring replication until later on in the path(s)?

   Both BIER and BIER-TE allow BFIR to "opportunistically" copy packets
   to a set of desired BFER on a packet-by-packet basis.  In BIER, this
   is done by OR'ing the BP for the desired BFER.  In BIER-TE this can
   be done by OR'ing for each desired BFER a BitString using the
   "independent branches" approach described in Section 5.3.3 and
   therefore also indicating the engineered path towards each desired
   BFER.  This is the approach that
   [I-D.ietf-bier-multicast-http-response] relies on.

It's a bit late in the day here (so I could be missing something), but
it seems to me that this paragraph is not really needed in this
document.  If it's helpful content, it could appear in the referenced
document, to which it seems more directly applicable.

Section 7

It would be great if (as Roman mentions) we could say that any
BFR/Controller connections used for BIER-TE MUST be protected by TLS or
security protections at least as strong as TLS.

We might consider something like "the discussion in [3272bis] is also
relevant for BIER-TE".

Section 11.2

If "it is expected that the reader be familiar with" RFC 8296, it should
probably be classified as normative.

NITS

By no means an exhaustive collection, but hopefully they will help.

Section 2.2

   may be called an "overlay" BIER-TE topology.  A BIER-TE topology will
   both "forward_connected" and "forward_routed" adjacencies may be
   called a "hybrid" BIER-TE topology.

s/will/with/

Section 2.3

                                                                In BIER-
           TE, bits in the BitString of a BIER packet header indicate an
           adjacency in the BIER-TE topology, and only the BFRs that are
           upstream of this adjacency have this bit populated with the
           adjacency in their BIFT.

I think that s/are upstream/are the upstream endpoint/ would greatly aid
readability.

       2.  In BIER, the implied reference option for the core part of
           the BIER layer control plane is the BIER extension to
           distributed routing protocol, such as standardized in ISIS/
           OSPF extensions for BIER, [RFC8401] and [RFC8444].  [...]

There seems to be some singular/plural mismatch(es) going on here.
Maybe "are the BIER extensions to distributed routing protocols, such as
those standardized in"?

   3.  The following element/functions described in the BIER
       architecture are not required by the BIER-TE architecture:

Probably best to use "elements/functions" to match plurality.

       1.  BIRTs on BFR for BIER-TE are not required when using a BIER-

I think "BFRs" plural.  (Probably the plural is best for the next few
instances of "BFR" (not "BFR-id") in this section, but I will not try to
note them all explicitly.)

Section 3.2

I think s/BFIR/BFIRs/ in all cases in this section (excluding
subsections) other than item 2.3, and possibly 2.1 if the intent is "for
each multicast overlay flow".

       3.  Install state on the BFIR to imposition the desired BIER
           packet header(s) for packets of the overlay flow.

s/imposition/impose/

Section 3.2.1

   o  Data-models and protocols to communicate between controller and
      BFR in step 1, such YANG/Netconf/RestConf.

probably s/BFR/BFRs/ and s/such/such as/.

Section 3.2.1.1

   When the topology is determined, the BIER-TE Controller then pushes
   the BitPositions/adjacencies to the BIFT of the BFRs, populating only
   those SI:BitPositions to the BIFT of each BFR to which that BFR
   should be able to send packets to - adjacencies connecting to this
   BFR.

This sentence should probably be rewritten and split up; it may have
multiple valid parse trees but regardless is pretty hard to parse.
Having separate terms for "the BFR receiving BIFT updates" and "the BFR
to which a given bit says to send a packet" would help a lot.

Section 3.2.1.2

   encoded as the same BitString.  In BIER-TE, the BitString used to
   reach the same set of BFER in the same sub-domain can be different
   for different overlay flows because the BitString encodes the paths
   towards the BFER, so the BitStrings from different BFIR to the same
   set of BFER will often be different, and the BitString from the same
   BFIR to the same set of BFER can different for different overlay
   flows for policy reasons such as shortest path trees, Steiner trees
   (minimum cost trees), diverse path trees for redundancy and so on.

I suggest breaking the sentence before "and the BitString from the same
BFIR".

Also, s/can different/can be different/

   See also [I-D.ietf-bier-multicast-http-response] for a solution
   describing this interaction.

I'm not sure that "solution" is the best word here ("solution to which
problem?").

Section 3.2.1.4

   When link or nodes fail or recover in the topology, BIER-TE could
   quickly respond with out-of-scope FRR procedures such as
   [I-D.eckert-bier-te-frr].  [...]

Is the intent to say "could quickly respond with FRR procedures [...]
the details of which are out of scope for this document"?

Section 3.3

   1.  On BFIR imposition of BIER header for packets from overlay flows.

comma after BFIR

   3.  On BFER removal of BIER header and dispatching of the payload

comma after BFER

Section 3.5

   available at each BFR and for each BP when it is providing congestion
   loss free services such as Rate Controlled Service Disciplines

hyphenate "congestion-loss-free"

   control protocol such as Netconf or any other protocol commonly used
   by a PCE to understand the resources of the network it operates on.

I think s/PCE/Controller/ since this is the only instance of "PCE" in
this document that's not part of "PCEP".

   the BIER-TE defined steering of packets.  This includes allocation of
   buffers to guarantee the worst case requirements of admitted RCSD
   traffic and potential policing and/or rate-shaping mechanisms,

I'm not sure, but possibly s/potential/potentially/

Section 4.2.1

   for copies of the packet made towards other adjacencies.  This can be
   used for example in ring topologies as explained below.

"Below" could become a reference to §5.1.6 as was done in §4.1.

Section 4.2.3

   packet is copied onto such an ECMP adjacency, an implementation
   specific so-called hash function will select one out of the lists
   adjacencies to which the packet is forwarded.  This ECMP hash

s/lists/list's/

Section 4.2.4

   packets.  Local_decap() adjacencies require the BFER to support
   routing or switching for NextProto to determine how to further
   process the packet.

I'm not finding a protocol element to map to "NextProto" in this
context, so maybe writing out in long form something like "the next
protocol layer" would be preferred.

Section 4.3

   With MPLS, it is also possible to reuse the same SD space for both
   BIER-TE and BIER, so that the same SD has both a BIER BIFT and
   according range of BIFT-ids and a disjoint BIER-TE BIFT and non-
   overlapping range of BIFT-ids.

I think s/according/corresponding/

   "forward_routed" requires an encapsulation permitting to unicast
   BIER-TE packets to a specific interface address on a target BFR.

"encapsulation that permits directing unicast BIER-TE packets"

Section 4.4

      void ForwardBitMaskPacket_withTE (Packet)
      {
          SI=GetPacketSI(Packet);
          Offset=SI*BitStringLength;
          for (Index = GetFirstbit position(Packet->BitString); Index ;
               Index = GetNextbit position(Packet->BitString, Index)) {
              F-BM = BIFT[Index+Offset]->F-BM;
              if (!F-BM) continue;                            [3]
              BFR-NBR = BIFT[Index+Offset]->BFR-NBR;
              PacketCopy = Copy(Packet);
              PacketCopy->BitString &= F-BM;                  [2]
              PacketSend(PacketCopy, BFR-NBR);
              // The following must not be done for BIER-TE:
              // Packet->BitString &= ~F-BM;                  [1]
          }
      }

GetFirstBitPosition and GetNextBitPosition are the spellings used in the
RFC 8279 pseudocode; I don't see a reason to diverge from them.  (The same
comment applies to the pseudocode in Figure 6 as well.)

   To solve both [1] and [2] for BIER, the F-BM of each bit needs to
   have all bits set that this BFR wants to route across BFR-NBR. [2]

Expanded out, "the F-BM of each bit needs to have" doesn't seem to make
much sense; would s/bit/bit index/ make more sense?

      *  The packets BitString is masked with those AdjacentBits before
         the loop to avoid doing this repeatedly for every PacketCopy.

s/packets/packet's/

                    case forward_routed({VRF},neighbor):
                        SendToL3(PacketCopy,{VRF,}l3-neighbor);

Should the first line use "l3-neighbor" as well as the second one?

Section 4.5

   BFR3 sees a BitString of p5,p7,p8,p10,p11,p12.  For those BPs, it has
   only adjacencies for p7,p8.  It creates a copy of the packet to BFER1
   (due to p7) and one to BFR4 (due to p8).  It clears p7, p8 before
   sending.

I think "clears both p7 and p8 in both copies before sending" is
clearer.
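
The BFR3 step quoted above can be sketched as follows.  This is only an
illustration of the suggested wording, assuming DNC is not set for p7 or
p8; the helper names are hypothetical and BitStrings are modelled as
Python integers.

```python
def bits(*positions: int) -> int:
    """Build a BitString (as an int) from 1-based bit positions."""
    value = 0
    for p in positions:
        value |= 1 << (p - 1)
    return value

def positions(bitstring: int) -> list:
    """List the 1-based bit positions that are set in a BitString."""
    return [i + 1 for i in range(bitstring.bit_length()) if (bitstring >> i) & 1]

incoming = bits(5, 7, 8, 10, 11, 12)  # BitString seen by BFR3
bfr3_adjacencies = bits(7, 8)         # BPs with adjacencies in BFR3's BIFT

to_replicate = incoming & bfr3_adjacencies  # p7 -> BFER1, p8 -> BFR4
outgoing = incoming & ~bfr3_adjacencies     # BitString carried by both copies

print(positions(to_replicate))  # [7, 8]
print(positions(outgoing))      # [5, 10, 11, 12]
```

Bits without an adjacency on BFR3 (p5, p10, p11, p12) stay set in both
copies, which is the behavior the Discuss expects from Figure 6.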

Section 4.6

   clear DNC flag, forward_routed() and local_decap.  As explained in
   Section 4.4, these REQUIRED BIER-TE forwarding functions can be
   implement via the same Forwarding Pseudocode as BIER forwarding

s/implement/implemented/

Section 5.1.4

   opposite LANs.  Adjacencies of such BFRs into their LAN still need a
   separate bit position.

Is this something like "such multi-homed BFRs"?

Section 5.1.9

                          Figure 17: Reuse of BP

   Reuse may also save BPs in larger topologies.  Consider the topology
   shown in Figure 20.  A BFIR/sender (e.g.: video headend) is attached

I suspect that s/20/17/ is intended.

Section 5.1.10

   o  BP can generally be reused across nodes that do not need to be
      consecutive in paths, but depending on scenario, this may limit

"consecutive in paths" might imply "directly one after the other" to
some readers.  Perhaps "where it can be guaranteed that no path will
ever need to traverse more than one node of the set"?

Section 5.3.3

   BIER-TE forwarding does not use the BFR-id, not does it require for

s/not does/nor does/

Section 5.3.6.2

   In addition, we use 4 bits in each SI: bia, bib, bea, beb: bit
   ingress a, bit ingress b, bit egress a, bit egress b.  These bits

It's probably worth spending a few more words to connect these "a" and
"b" back to the 6 different BFR1a/BFR1b through BFR6a/BFR6b of Figure
20.  (Figure 20, which appears in §5.3.6.1, is not mentioned at all from
this section at present.)

   legs: 1) BFIR to ingress are edge and 2) core to egress area edge.

s/are edge/area edge/

Section 6

   hop to each destination Node-SID.  What BIER does not allow is to
   indicate intermediate hops, or terms of SR the ability to indicate a

s/terms of SR/in terms of SR/

Section 7

   In BIER, the standardized methods for the routing underlays as well
   as to distribute BFR-ids and BFR-prefixes are IGPs such as specified
   in [RFC8401] for IS-IS and in [RFC8444] for OSPF.  Attacking the

I can't make a parse tree for this sentence.  I think it's just trying
to say that RFCs 8401 and 8444 specify how to distribute BFR-ids and
BFR-prefixes over the respective IGP, and the respective IGP inherently
distributes information needed for routing underlays, but I could be
wrong.

   against the results of the routing protocol, enabling DoS attacks
   against paths or addressing (BFR-id, BFR-prefixes) used by BIER.

I think s/addressing/the addressing/

   adjacencies, then this is still an attack vector as in BIER, but only
   for BIER-TE forward_routed() adjacencies, and no other adjacencies.

s/and no/and not/

   the BIER-TE controller/control-plane can and are much more commonly

s/can/can be/

   forwarding rules are defined to be as strict in clearing bits as they
   are.  The clearing of all bits with an adjacency on a BFR prohibits

I think just "defined to be as strict in clearing bits as possible" --
"defined to be as <anything> as they are" is basically a tautology.

   that a looping packet creates additional packet amplification through
   the misconfigured loop on the packets second or further times around

s/packets/packet's/

   Deployments especially where BIER-TE would likely be beneficial may
   include operational models where actual configuration changes from

This "especially" seems out of place (I think the sentence would work if
it's just dropped, though maybe moving it is preferred).

   the controller are only required during non-productive phases of the

Maybe s/productive/production/?

   networks life-cycle, such as in embedded networks or in manufacturing

s/networks/network's/ (just the first one)

   reverting the network/installation into non-productive state.  Such

("production" again)