Re: [Bier] Questions on draft-eckert-bier-te-arch-01

Toerless Eckert <eckert@cisco.com> Fri, 16 October 2015 19:41 UTC

Date: Fri, 16 Oct 2015 12:41:26 -0700
From: Toerless Eckert <eckert@cisco.com>
To: Eric C Rosen <erosen@juniper.net>
Message-ID: <20151016194126.GA21691@cisco.com>
References: <55DF5BAD.9060003@juniper.net> <20151007221035.GA26709@cisco.com> <561BFD2C.8090708@juniper.net> <20151013000257.GA13911@cisco.com> <56214683.5090900@juniper.net>
Archived-At: <http://mailarchive.ietf.org/arch/msg/bier/agcocbYg5_44tUrMlLQwr0_4H4c>
Cc: "BIER (bier@ietf.org)" <bier@ietf.org>, Neale Ranns <nranns@cisco.com>
Subject: Re: [Bier] Questions on draft-eckert-bier-te-arch-01

On Fri, Oct 16, 2015 at 02:48:35PM -0400, Eric C Rosen wrote:
> >>[Eric] in BIER-TE forwarding the BFIR-id doesn't require a bit in the
> >>BitString, so eliminating the BFIR-id doesn't really seem to provide
> >>much value.
> 
> >[Toerless] 16 bit header saved.
> 
> I think that would count as "not much value" ;-)  FWIW, originally the 
> MPLS encapsulation made the BFIR-id an optional field, but then you need 
> a flag bit to say whether it is present, and it just seemed simpler to 
> always include it.

Agreed. I forgot the smiley in my text.

> >[Toerless] If BFR is leaf BFER, then controller only assigns bit to the
> >links leading up to the leaf BFER, not the BFER itself.
> 
> >For all non-leaf BFER, controller assigns an individual bit.
> 
> Got it, I didn't attend properly to the fact that non-leaf BFERs have 
> their own bits.
> 
> Maybe I should have said "almost got it" ;-)  Under what conditions will 
> a leaf node receive a packet that it doesn't need to consume?  That is, 
> what is the significance of failing to set the local_decaps bit?

With today's scope of BIER-TE, only when the bitstring has some bit set
that is assigned to an interface into the leaf.

We could easily avoid the single "shared leaf bit" by having a forwarding
plane feature "you're a leaf BFER, every BIER-TE packet you receive
needs to be punted". But that's a new forwarding plane feature, whereas the
"shared leaf bit" is just a bit allocation trick, so it keeps the forwarding
plane simpler/cleaner.

> Doesn't this scheme make it difficult to add a new BFER for an existing 
> flow, or to change a non-leaf node from being a non-BFER to being a BFER 
> for an existing flow?

Let's say you have a BFER with 2 upstream links into it, and the
bitstring already has enough bits to get the packet into both routers
north of these links.

If it's a leaf BFER, you just add one of the two link bits to get
the packet to the leaf BFER. The choice of which one to use is your
traffic engineering aspect. We assume that the shared-leaf bit
was already set for some other leaf BFER somewhere else in the network.

If it's not a leaf BFER, you still need to set one of the two bits
for the two links, but now you also need to set one unique bit for
the BFER itself so the BFER can punt the packet. The unique bit is needed
because you may not want this BFER to punt the packet, only some other
BFER further south of it.
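
A rough sketch of the two cases above, assuming the bitstring is modeled as
a set of bit indices. The function name, the parameter names and the bit
numbers are made up for illustration; they are not from the draft.

```python
SHARED_LEAF_BIT = 1  # hypothetical index of the shared leaf local_decap bit


def add_receiver(bitstring, link_bits, chosen_link, own_bit=None):
    """Return the bitstring after adding one BFER as a receiver.

    link_bits:   bits assigned to the upstream links into the BFER
    chosen_link: the link bit picked by traffic engineering
    own_bit:     unique local_decap bit of a non-leaf BFER, None for a leaf
    """
    if chosen_link not in link_bits:
        raise ValueError("chosen_link must be one of the BFER's link bits")
    new_bits = set(bitstring) | {chosen_link}
    # A leaf BFER relies on the shared leaf bit; a non-leaf BFER needs its own.
    new_bits.add(SHARED_LEAF_BIT if own_bit is None else own_bit)
    return new_bits
```

For a leaf BFER only the chosen link bit is new if the shared leaf bit is
already in the bitstring; for a non-leaf BFER two bits are added.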

> In non-TE BIER, you could also set up each country as a separate BIER 
> sub-domain (or even a completely separate BIER domain, if we had 
> inter-domain signaling of some sort).  Something like that might be 
> needed anyway for inter-AS BIER.

Sure. I hope BIER-TE can be agnostic and follow the same design principles:

- Use SI pretty much as in BIER, so that the same flow overlay and
  routing underlay that rely on the BFR-ID (which implies the SI) can be used.
- Use the same sub-domain as BIER or a separate one, based on the bit index
  allocation you use for BIER (do you have enough spare bit indices
  you can use in BIER-TE? No? -> use a separate sub-domain).

> Think about an application like MVPN extranet, where one (S,G) flow is 
> traveling through one sub-domain, another (S,G) flow is traveling 
> through another sub-domain, and you have to make sure that each flow 
> gets punted to the proper VRF.  If a given BFER for that flow is in both 
> sub-domains, you'd need to know the sub-domain in order to dispatch to 
> the proper VRF.  But I guess this could happen in non-TE BIER as well if 
> different VPNs are using different sub-domains.

The VRF to receive into is a parameter of the local_decap adjacency.
I would prefer that I only need one bit per BFER per sub-domain, which
for the default sub-domain would likely mean receiving only into
the global table, and then the next encap (e.g. an MPLS label) determines
which VRF to actually process the packet in. If you assume you have
multiple sub-domains, then you'd need to have a bit for the BFER in
each sub-domain anyhow, so each of these bits could have a local_decap
with a different VRF.

The shared leaf-BFER local_decap bit would likely still work in
this context: you have some sub-domain that is likely used for
a particular VPN context. All leaf BFERs in that sub-domain
would have this bit populated with local_decap(VRF1), and
the shared leaf-BFER bit in another sub-domain on the same BFER
would have local_decap(VRF2).
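
To make that concrete, here is a small sketch assuming a BIFT modeled as a
dict keyed by (sub-domain, bit index). The table layout, the bit numbers,
the interface name and the function name are all illustrative assumptions.

```python
LEAF_BIT = 1  # hypothetical index of the shared leaf-BFER bit

# Hypothetical BIFT of one BFER that sits in sub-domains 0 and 1.
bift = {
    (0, LEAF_BIT): ("local_decap", "VRF1"),
    (1, LEAF_BIT): ("local_decap", "VRF2"),
    (0, 7): ("forward_connected", "eth0"),
}


def decap_vrfs(bift, subdomain, bitstring):
    """Return the VRFs a packet is punted into on this BFER."""
    vrfs = []
    for bit in sorted(bitstring):
        adjacency = bift.get((subdomain, bit))
        if adjacency is not None and adjacency[0] == "local_decap":
            vrfs.append(adjacency[1])
    return vrfs
```

The same bit index resolves to a different VRF depending on the sub-domain
the packet arrived in, so no extra forwarding logic is needed.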

If we tried to eliminate the shared leaf bit, then we would need
to have the forwarding logic: If you receive a BIER-TE packet
for a particular subdomain X, punt to VRF associated with X.

> >[Toerless]  "On BFR2, the adjacency also points to the
> >    clockwise neighbor BFR1, but without DNR set.  "
> 
> >DNR bit is part of the adjacency in the BIFT, so only
> >the copies sent across the ring adjacencies will have it set, and then
> >we do not set DNR towards the end of the ring before it would loop.
> 
> Doesn't this assume that all the BIER-TE packets enter the ring at a 
> single ingress?  If packets can enter the ring at multiple points, what 
> is the meaning of "end of the ring"?

In the example, all packets are entering from one of two "upstream"
routers. That's the typical MSO case. If you have ring traffic where
multicast can enter from any node, then the TE design is a bit
more complex: You effectively have to use two bits, one to pass
the packet clockwise around the ring, the other one to pass it
counterclockwise. You still choose one link on the ring where
you break the loop. Whenever you receive a packet from outside
the ring, the copy you send clockwise into the ring would already
have the counterclockwise bit reset, because DNR only keeps
the bit set when the packet is replicated across the adjacency
on which DNR is set. Likewise for the copy in the other
direction.

Maybe I should use that more complex example, I was just afraid it
would make readers' eyes bleed too much ;-)

> >[Toerless] Multiple rings is just an example. DNR allows you to save bits
> >in bitstring whenever you are fine all BIER-TE traffic is flooded
> >across that sub-topology. Just set up in the BIFTs appropriate
> >adjacencies to flood (loop free) across the topology.
> 
> I see how this would work for a sub-topology that is itself a tree 
> (single ingress).  But if the non-leaf nodes in the sub-topology are 
> also BFERs, you'd still need bits for them.

Yes. From the accounting side you always assume one bit per BFER,
then the question is how many more bits you need for the intervening
topology. E.g. a ring is one or two bits depending on how traffic needs
to flow. For a leaf BFER you only need additional bits for the second and
further links into the BFER, so non-redundant BFERs are especially
cheap on the bits. (This is accounting. The forwarding reality is of course
that the first link into the leaf BFER requires a bit but the BFER itself
does not.)
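
As a back-of-the-envelope sketch of that accounting: the function below
counts only the cases named here (BFERs, extra links into leaf BFERs,
rings) and leaves out bits for any other transit links. Function and
parameter names are assumptions for illustration.

```python
def accounted_bits(leaf_uplinks, non_leaf_bfers, rings):
    """Rough bit budget for one sub-domain.

    leaf_uplinks:   number of upstream links for each leaf BFER
    non_leaf_bfers: number of non-leaf BFERs
    rings:          one bool per ring, True if traffic can enter at any node
    """
    bfer_bits = len(leaf_uplinks) + non_leaf_bfers  # one bit per BFER
    extra_link_bits = sum(n - 1 for n in leaf_uplinks)  # 2nd and further links
    ring_bits = sum(2 if multi_ingress else 1 for multi_ingress in rings)
    return bfer_bits + extra_link_bits + ring_bits
```

For example, accounted_bits([1, 2], 1, [False, True]) counts 3 BFER bits,
1 extra link bit and 3 ring bits.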

Another way to save bits that I have not discussed is to have even
transit BFERs use a local_decap on the leaf-BFER punt bit, and
expect that the next layer can do a fast-drop of the payload if
the packet is not of interest. But that of course depends highly
on the payload. Of course, if you just want to traffic engineer
some stupid L2 broadcast service for a particular sub-domain,
and you're only interested in the bits for the path engineering
in your core, then a single shared bit in that sub-domain for all
BFERs is all you need.

Cheers
    Toerless

-- 
---
Toerless Eckert, eckert@cisco.com