Re: [Bier] WGLC - draft-ietf-bier-te-arch

Toerless Eckert <tte@cs.fau.de> Mon, 28 October 2019 17:24 UTC

Date: Mon, 28 Oct 2019 18:24:33 +0100
From: Toerless Eckert <tte@cs.fau.de>
To: "Jeffrey (Zhaohui) Zhang" <zzhang@juniper.net>
Cc: "gjshep@gmail.com" <gjshep@gmail.com>, Xiejingrong <xiejingrong@huawei.com>, BIER WG <bier@ietf.org>, Mike McBride <mmcbride7@gmail.com>, "Pascal Thubert (pthubert)" <pthubert@cisco.com>
Archived-At: <https://mailarchive.ietf.org/arch/msg/bier/iJpiHHZ-jOQ6c0ZiXk0x2_exK04>

Thanks a lot, Jeffrey, inline

Will wait until the end of the week and then push -05.

For the time being, the fixes from your input are on GitHub; diff here:

http://tools.ietf.org//rfcdiff?url1=http://tools.ietf.org/id/draft-ietf-bier-te-arch-04.txt&url2=https://raw.githubusercontent.com/toerless/bier-te-arch/master/draft-ietf-bier-te-arch-05.txt


cheers
     Toerless

On Mon, Oct 28, 2019 at 02:05:49AM +0000, Jeffrey (Zhaohui) Zhang wrote:
> Hi Toerless,
> 
> Some questions and comments up to before section 7. I'll continue to review but it'll take more time.
> 
> 4.2.  BFER
> 
>    Every BFER is given a unique BitPosition with a local_decap
>    adjacency.
> Do you mean every *non-leaf* BFER?

Yes, fixed.

-----

> 4.3.  Leaf BFERs
> 
>    Leaf BFERs are BFERs where incoming BIER-TE packets never need to be
>    forwarded to another BFR but are only sent to the BFER to exit the
>    BIER-TE domain.  For example, in networks where PEs are spokes
>    connected to P routers, those PEs are Leaf BFIRs unless there is a
>    U-turn between two PEs.
> 
> Understand the Leaf BFER definition, but the "U-turn" example is hard to picture. Might as well just remove the example.

I thought the U-turn was the simplest comparison of leaf vs. non-leaf BFER.

I have now added a picture and text to help with visualization. Here:


<figure anchor="lan-picture" title="Leaf vs. non-Leaf BFER Example">
<artwork align="left"><![CDATA[
        BFR1(P) BFR2(P)             BFR1(P)  BFR2(P)
          |  \ /  |                    |       |
          |   X   |                    |       |
          |  / \  |                    |       |
     BFER1(PE)  BFER2(PE)        BFER1(PE)----BFER2(PE)

         Leaf BFER /               Non-Leaf BFER /
          PE-router                  PE-router
]]></artwork></figure>

... Consider how redundant disjoint
traffic can reach BFER1/BFER2 in the above picture: When BFER1/BFER2
are Non-Leaf BFERs as shown on the right-hand side, one traffic
copy would be forwarded to BFER1 from BFR1, but the other one
could only reach BFER1 via BFER2, which makes BFER2 a non-Leaf
BFER, and vice versa BFER1 also a non-Leaf BFER.
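
A minimal Python sketch of that distinction, under the assumption that a
BIFT can be modelled as a dict from BitPosition to an (adjacency type,
neighbor) pair. The BP numbers and the helper name is_leaf_bfer are made
up for illustration and do not come from the draft:

```python
def is_leaf_bfer(bift):
    """True if no BIFT entry forwards packets on to another BFR."""
    return all(adj[0] == "local_decap" for adj in bift.values())

# Left-hand side of the picture: BFER1 only decapsulates.
bfer1_leaf = {1: ("local_decap", None)}

# Right-hand side: BFER1 also forwards over the BFER1----BFER2 link,
# so the second disjoint copy can reach BFER2 via BFER1.
bfer1_non_leaf = {
    2: ("local_decap", None),
    3: ("forward_connected", "BFER2"),
}

print(is_leaf_bfer(bfer1_leaf), is_leaf_bfer(bfer1_non_leaf))
```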

-----

>    In a setup with a hub and multiple spokes connected via separate p2p
>    links to the hub, all p2p links can share the same BitPosition.  The
>    BitPosition on the hubs BIFT is set up with a list of
>    forward_connected adjacencies, one for each Spoke.
> 
> Using the BP for the p2p links with a list of forward_connected adjacencies means packets will be sent to all spokes. Unlike the LAN case where BW may not be a concern, the hub-and-spoke connection may have BW concern; or at least it should be pointed out?

How about instead this paragraph outlining a useful application:

<t>This type of optimized BP could be used for example when all
traffic is "broadcast" traffic such as live-TV or situation-awareness
(SA).  This BP optimization can then be used to explicitly steer different
traffic flows across different ECMP paths in Data-Center or broadband-aggregation
networks.</t>
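
To make the effect of the shared BP concrete, here is a small Python
sketch. It is only an illustration under assumed names: the BIFT layout,
the BP number 5, the spoke names and the function replicate are all
hypothetical. Setting the one shared bit produces a copy for every spoke,
which is exactly the bandwidth trade-off raised above:

```python
# Hypothetical hub BIFT: one BitPosition shared by all p2p links to spokes.
bift_hub = {
    5: [("forward_connected", "spoke1"),
        ("forward_connected", "spoke2"),
        ("forward_connected", "spoke3")],
}

def replicate(bift, bitstring):
    """Return the neighbors that receive a copy for the BPs set in bitstring.

    bitstring is modelled as a set of BitPosition numbers.
    """
    out = []
    for bp in sorted(bitstring):
        for kind, nbr in bift.get(bp, []):
            if kind == "forward_connected":
                out.append(nbr)
    return out

copies = replicate(bift_hub, {5})
print(copies)
```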

> Section 4.7 started with BP assignment to a link bundle between BFR1 and BFR2. That's easy to understand.

Right. Simplest topology example for ECMP.

> but subsequent polarization example confuses me. It seems that BP 0:6 is assigned to the routed adjacency BFR10 (which is actually talked about in Section 4.8).

Section 4.7 does not mention "routed" at all, so there are no routed
adjacencies at all used in 4.7. So i am not sure what you are confused
about.

The whole purpose of the ECMP BPs is of course to save bits, otherwise
we'd give each link a separate BP, which would be 6 BP to reach
to BFR4...BFR7 from BFR1. 

> Since tunneling is used, ECMP (and polarization) is just an underlay issue. Anyway, it's better to discuss polarization in section 4.8.

As said above, 4.7 is not about the underlay, but explicit ECMP in
BIER-TE. Let me know if you can think of any specific text to
help better avoid that confusion. I thought it is clear from the
text.

> 4.8.  Routed adjacencies
> 
> If I understand it correctly, there is a BP assigned to L1/L2/L3 respectively (p2p link),
> and then there are BPs assigned to MP2P tunnels (routed adjacency from every BFR) to the L1/L2/L3 interface addresses and loopback addresses on BFR2/3.

Ok that wasn't quite the read i expected. Let me clarify the text/picture:

                   ...............     
         ...BFR1--...           ...--L1-- BFR2...
                  ... .Routers. ...--L2--/  
         ...BFR4--...           ...------ BFR3...
                   ...............         |
                                          LO
                    Network Area 1

Assume the requirement in the above picture is to explicitly steer
traffic flows that have arrived at BFR1 or BFR4 via a shortest path 
in the routing underlay "network area 1" to one of the following three
next segments: (1) BFR2 via link L1, (2) BFR2 via link L2, (3) via BFR3.

To achieve this, both BFR1 and BFR4 are set up with a forward_routed
adjacency BitPosition towards an address of BFR2 on link L1, another
forward_routed BitPosition towards an address of BFR2 on link L2 and a
third forward_routed BitPosition towards a node address LO of BFR3.
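
As a rough Python sketch of that setup (the addresses are taken from a
documentation prefix, and the BP numbers 1..3 and the helper names are
assumptions for illustration only, none of them from the draft):

```python
# Illustrative values; the draft does not assign concrete ones here.
ADDR_BFR2_L1 = "192.0.2.1"    # address of BFR2 on link L1
ADDR_BFR2_L2 = "192.0.2.5"    # address of BFR2 on link L2
ADDR_BFR3_LO = "192.0.2.33"   # node (loopback) address LO of BFR3

def routed_bift():
    """BIFT fragment with one forward_routed adjacency per next segment."""
    return {
        1: ("forward_routed", ADDR_BFR2_L1),
        2: ("forward_routed", ADDR_BFR2_L2),
        3: ("forward_routed", ADDR_BFR3_LO),
    }

# The same three entries are installed on both BFR1 and BFR4.
bifts = {"BFR1": routed_bift(), "BFR4": routed_bift()}

def next_segments(bfr, bitstring):
    """Addresses the routing underlay is asked to deliver copies to."""
    table = bifts[bfr]
    return [table[bp][1] for bp in sorted(bitstring) if bp in table]

print(next_segments("BFR1", {2}))
```

The underlay then carries each copy on its shortest path to the chosen
address; which BP is set in the packet decides between the three segments.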

Does this clear up the confusion?

> If BFR2/3 are also BFERs, then they additionally will have BFER BPs.
> On BFR1/4, the BIFT entries for the MP2P BPs for the L1/L2/L3/loopback interface addresses of BFR2/3 will use forward_routed(interface/loopback address). For a packet to be decapsulated on a BFER, there is a need for both the BFER BP and another BP (p2p/lan/hub-spoke/routed-adjacency) in the packet (the former is for decapsulation and the latter is for getting it there).

This is not discussed in this section, but you are right - unless
BFR2 or BFR3 is a leaf BFR. In that case, it would just leverage
the one shared "leaf-BFR" BP, so they do not need a per-BFER BP for
local_decap(). 

> If that's the case, it's worth pointing the above out.

Hmm... The logic of BFER BPs is totally independent of the logic
of forward_routed adjacency, so i would worry that repeating the
explanation of BFER BPs would conflate the forward_routed
explanation.

> Would it be better to fold sub-section 4.8.2 into section 3.2.2?

I don't think so. Section 3. is about the behavior of the BIFT,
without trying to understand why and how to create those BIFT entries.

Section 4. is about the controller's (or operator's) logic figuring
out what BPs are needed depending on topology and required 
BIER-TE traffic engineering policies and how BPs can be saved.

> Actually, the reason that I thought this is MP2P is that 0:6 is present on R1, R2, and R3 (and more I assume) in Figure 12, but now I think it can't be MP2P (so it is not correct to have 0:6 present on those routers - only the p2p tunnel head/tail should have the BP present in the BIFT). The reason is that if it were MP2P, any router getting a copy will send it to the endpoint of the routed adjacency, causing lots of duplicates.
> 
> Am I getting this correct?

I think you are still explaining from the misunderstanding that the
ECMP explanations were about routed adjacencies.

I have now expanded the somewhat terse text in the BIFT table pictures,
to make it clear that the ECMP is across multiple forward_connected
adjacencies in the examples. For example, the first BIFT picture:

  BIFT entry in BFR1:
  ------------------------------------------------------------------
  | Index |  Adjacencies                                           |
  ==================================================================
  | 0:6   |  ECMP({forward_connected(L1, BFR2),                    |
  |       |        forward_connected(L2, BFR2),                    |
  |       |        forward_connected(L3, BFR2)}, seed)             |
  ------------------------------------------------------------------

Of course, an ECMP adjacency can be across any type of adjacencies,
but all the text/explanations used forward_connected, and now the
pictures show that explicitly.
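
A hedged Python sketch of what such an ECMP entry does per packet. The
hash is an assumption: CRC32 over seed and entropy stands in for whatever
function an implementation really uses, and the seed value 7 and the name
ecmp_select are made up:

```python
import zlib

# Entry for index 0:6 as in the BIFT picture above; the seed is hypothetical.
ECMP_0_6 = {
    "adjacencies": [("forward_connected", "L1", "BFR2"),
                    ("forward_connected", "L2", "BFR2"),
                    ("forward_connected", "L3", "BFR2")],
    "seed": 7,
}

def ecmp_select(entry, entropy):
    """Pick exactly one adjacency of the ECMP set for a packet's entropy."""
    key = f"{entry['seed']}:{entropy}".encode()
    idx = zlib.crc32(key) % len(entry["adjacencies"])
    return entry["adjacencies"][idx]

chosen = ecmp_select(ECMP_0_6, 1234)
print(chosen)
```

The point is only that one BP covers all three links and that the same
entropy always maps to the same link.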

>    To inhibit looping in the face of such physical misconfiguration,
>    only forward_connected adjacencies are permitted to have DNR set, and
>    the link layer destination address of the adjacency (e.g.  MAC
>    address) protects against closing the loop.  Link layers without port
>    unique link layer addresses should not be used with the DNR flag set.
>
> It's not clear how link layer address helps?

I have expanded this to
"link layer port unique unicast destination address"

Aka: MPLS or Ethernet have unique link layer destination addresses
(label or destination MAC). If you think about incorrectly plugged
HDLC links (such as old T1/T3/... links), they only have 2 generic
addresses, 1 or 3 in the HDLC frame if i remember correctly. So when you
misplug one of those p2p cables, the packets would be
incorrectly received by the wrong receiver node and then DNR could
cause persistent loops only solved by TTL.

>       void ForwardBitMaskPacket_withTE (Packet)
>       {
>           SI=GetPacketSI(Packet);
>           Offset=SI*BitStringLength;
>           for (Index = GetFirstBitPosition(Packet->BitString); Index ;
>                Index = GetNextBitPosition(Packet->BitString, Index)) {
>               F-BM = BIFT[Index+Offset]->F-BM;
>               if (!F-BM) continue;
>               BFR-NBR = BIFT[Index+Offset]->BFR-NBR;
>               PacketCopy = Copy(Packet);
>               PacketCopy->BitString &= F-BM;                  [2]
>               PacketSend(PacketCopy, BFR-NBR);
>               // The following must not be done for BIER-TE:
>               // Packet->BitString &= ~F-BM;                  [1]
>           }
>       }
> 
> Because the forwarding is different from BIER forwarding (because of [1] above), we might as well introduce an optimization here - for each BIFT, calculate the F-BM of the BIFT itself (the logical "or" of all the BPs presented in this BIFT) and then use (packet->bitstring & BIFT.F-BM) as the input to GetFirst/NextBitPosition(). That should skip many bits.

Right. But i explicitly removed those optimizations (i had them in older
draft versions) because the whole idea of this picture is solely the
comparison with figure 4 of RFC8279.
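
For reference, the optimization suggested above could look roughly like
this in Python. This is a sketch, not the draft's pseudocode: the SI
offset is left out, the bitstring is a plain integer, and the BIFT layout,
the F-BM values and the name forward_te are assumptions:

```python
def forward_te(bitstring, bift):
    """Forward one BIER-TE packet; return a list of (neighbor, copy_bitstring).

    bitstring: int, bit (i-1) set means BitPosition i is set in the packet.
    bift: dict BitPosition -> (f_bm, neighbor), populated entries only.
    """
    # OR of all BPs populated in this BIFT; in practice precomputed once.
    bift_mask = 0
    for bp in bift:
        bift_mask |= 1 << (bp - 1)

    relevant = bitstring & bift_mask   # skip bits this BFR has no entry for
    sent = []
    bp = 1
    while relevant:
        if relevant & 1:
            f_bm, nbr = bift[bp]
            sent.append((nbr, bitstring & f_bm))   # step [2]
            # Step [1] is deliberately absent: the packet's own bitstring
            # is never modified while iterating.
        relevant >>= 1
        bp += 1
    return sent

bift = {2: (0b11110000, "BFR2"), 4: (0b11110000, "BFR3")}
result = forward_te(0b10011010, bift)
print(result)
```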

>    Eliminating the need to perform [1] also makes processing of bits in
>    the BIER-TE bitstring independent of processing other bits, which may
>    also simplify forwarding plane implementations.

> Don't see how it simplifies forwarding plane implementation;

The way i figure (and i may not have thought up all BIER optimizations),
[1] does create more sequentialization or higher complexity to overcome
it. With BIER-TE i can pre-calculate a mask of BPs relevant for an
egress linecard and only look at those bits, and this mask does not need
to include bits that would have caused copies to interfaces on other
linecards.
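
A small Python sketch of that idea, assuming a hypothetical mapping from
each BP to the egress linecard of its adjacency (the linecard names and
both function names are made up for illustration):

```python
def linecard_masks(bp_to_linecard):
    """Precompute, per egress linecard, the mask of BPs it has to look at."""
    masks = {}
    for bp, lc in bp_to_linecard.items():
        masks[lc] = masks.get(lc, 0) | (1 << (bp - 1))
    return masks

def bits_for(bitstring, mask):
    """BitPositions one linecard processes, independent of all other bits."""
    relevant = bitstring & mask
    return [i + 1 for i in range(relevant.bit_length()) if (relevant >> i) & 1]

masks = linecard_masks({1: "LC0", 2: "LC0", 3: "LC1"})
print(bits_for(0b110, masks["LC0"]), bits_for(0b110, masks["LC1"]))
```

Because nothing resets bits in the packet while another bit is being
processed, each linecard can work from its own mask without waiting for
the others.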

> is the last clause ("which may ...") meant to say it leads to deterministic forwarding behavior?

As in deterministic RFC8279 ECMP, yes.

But i would prefer not to reuse that word. Too overloaded (see DetNet WG).

"Independent BP processing" would be a better explanatory term,
but i was trying to avoid creating too much new terminology.

>    The following pseudocode is comprehensive:
> 
> The above sentence reads a bit strange (or lacks some segue).

I hope not, but maybe best left to a native English speaker (RFC-editor).

The first (RFC8279) pseudocode was simplified. The second one is
comprehensive. If not comprehensive, what's a good opposite of simplified?

>    For BIER and BIER-TE forwarding, the most important result of using
>    multiple SI and/or subdomains is the same: Packets that need to be
>    sent to BFER in different SI or subdomains require different BIER
>    packets: each one with a bitstring for a different (SI,subdomain)
>    bitstring.
> 
> Should the last "bitstring" be "combination"?

Yes. thanks.
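
To illustrate the quoted paragraph, a short Python sketch that groups
receivers by (SI, subdomain) and builds one bitstring per group. The
tuple layout and the name packets_needed are assumptions for illustration:

```python
from collections import defaultdict

def packets_needed(receivers):
    """Return {(si, subdomain): bitstring}, one BIER packet per key.

    receivers: iterable of (subdomain, si, bit_position) tuples.
    """
    bitstrings = defaultdict(int)
    for subdomain, si, bp in receivers:
        bitstrings[(si, subdomain)] |= 1 << (bp - 1)
    return dict(bitstrings)

pkts = packets_needed([(0, 0, 1), (0, 0, 3), (0, 1, 2)])
print(pkts)
```

Two BFERs in the same (SI, subdomain) share one packet; the third one,
in a different SI, forces a second packet.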

> Some editorial nits:
> 
>    Forwarding of BIER-TE is designed to allow common forwarding hardware
>    with BIER.  In fact, one of the main goals of this document is to
>    encourage the building of forwarding hardware that cannot only
>    support BIER, but also BIER-TE - to allow experimentation with BIER-
>    TE and support building of BIER-TE control plane code.
> 
> Should "cannot only" be "can not only"?

thanks.

> Additionally, curious why you say "controller host" - typically people just say "controller".

Yes, the joke is getting old now. Will simplify the term.

>    This optimization does not work in the face of BFRs redundantly
>    connected to more than one LANs with this optimization because these
>    BFRs would receive duplicates and forward those duplicates into the
>    opposite LANs.  Adjacencies of such BFRs into their LANs still need a
>    separate BitPosition.
> 
> s/face/case/?

Ack.

>    In a setup with a hub and multiple spokes connected via separate p2p
>    links to the hub, all p2p links can share the same BitPosition.  The
>    BitPosition on the hubs BIFT is set up with a list of
>    forward_connected adjacencies, one for each Spoke.
> 
> s/hubs/hub's/?

Ack.

Thanks a lot.

Cheers
    toerless
   
> 
> 
> Thanks.
> Jeffrey
> 
> 
> ________________________________________
> From: BIER [bier-bounces@ietf.org] on behalf of Toerless Eckert [tte@cs.fau.de]
> Sent: Tuesday, July 09, 2019 23:38
> To: Mike McBride
> Cc: Greg Shepherd; BIER WG; Pascal Thubert (pthubert)
> Subject: Re: [Bier] WGLC - draft-ietf-bier-te-arch
> 
> Thanks, Mike
> 
> The authors also reviewed the document and concluded that it was really
> hard to get into the document context because of too many forward
> dependencies. We tried to fix this by adding two hopefully good & basic
> examples into the Introduction section and using them to also add
> a better definition of the term "BIER-TE Topology" in the Introduction.
> Hopefully this makes reading the rest of the document smoother.
> 
> Also improved text of Abstract and refined text comparing BIER-TE with SR.
> 
> http://tools.ietf.org//rfcdiff?url1=https://tools.ietf.org/id/draft-ietf-bier-te-arch-02.txt&url2=https://tools.ietf.org/id/draft-ietf-bier-te-arch-03.txt
> 
> Cheers
>     Toerless
> 
> On Wed, Jun 26, 2019 at 10:39:36AM -0700, Mike McBride wrote:
> > How about three? I support.
> > mike
> >
> > On Tue, Jun 25, 2019 at 10:42 AM Greg Shepherd <gjshep@gmail.com> wrote:
> > >
> > > We cannot take two 'yes' votes as WG consensus.
> > > Please, read and respond. If you don't support, then please vote as much publicly right here.
> > >
> > > Thanks,
> > > Greg
> > >
> > > On Mon, Jun 3, 2019 at 10:05 PM Pascal Thubert (pthubert) <pthubert@cisco.com> wrote:
> > >>
> > >> Support:
> > >>
> > >> I see great value in deterministic networks as well as IOT (with RPL).
> > >>
> > >> All the best,
> > >>
> > >> Pascal
> > >>
> > >> > -----Original Message-----
> > >> > From: BIER <bier-bounces@ietf.org> On Behalf Of Toerless Eckert
> > >> > Sent: Tuesday, 4 June 2019 02:03
> > >> > To: Greg Shepherd <gjshep@gmail.com>
> > >> > Cc: BIER WG <bier@ietf.org>
> > >> > Subject: Re: [Bier] WGLC - draft-ietf-bier-te-arch
> > >> >
> > >> > +1
> > >> > Obviously support as co-author.
> > >> >
> > >> > On Wed, May 29, 2019 at 12:41:26PM -0700, Greg Shepherd wrote:
> > >> > > Please read and respond to this thread w/ or w/o support.
> > >> > >
> > >> > > https://datatracker.ietf.org/doc/draft-ietf-bier-te-arch/
> > >> > >
> > >> > > Vote ends 5 June 2019.
> > >> > >
> > >> > > Thanks,
> > >> > > Shep
> > >> > > (chairs)
> > >> >
> > >> > > _______________________________________________
> > >> > > BIER mailing list
> > >> > > BIER@ietf.org
> > >> > > https://www.ietf.org/mailman/listinfo/bier
> > >> >
> > >
> 
> --
> ---
> tte@cs.fau.de
> 

-- 
---
tte@cs.fau.de