[rbridge] network topology constraints in draft-tissa-trill-cmt-00

Santosh Rajagopalan <sunny.rajagopalan@us.ibm.com> Wed, 28 March 2012 22:12 UTC


This is a clever draft, but I wanted to point out some network topology 
constraints in the proposal:

1) It looks like you can only use this proposal if the CE switches are 
*not* interconnected in any fashion outside of the trill network. This is 
because sending packets into the CE network strips off information needed 
to prevent loops. Let me illustrate using the example from the draft: RB1 
receives a multidestination packet from the trill campus on tree 1, and it 
also has an "affinity" for that tree. So it decaps it and sends the packet 
into the CE network (basically, a copy gets sent to each of CE1..CEn). 

Let's assume that the CE network is composed of interconnected switches, 
instead of the isolated switches shown in the picture. This is reasonable, 
because it avoids needing to take the extra hop to the aggregation layer 
for end-systems on the same chassis or rack. This means that a broadcast 
packet would be replicated by the CE network to each of its switches, 
including CE1..CEn. So CE1..CEn just got their first duplicate. Each of 
these switches sees the attached rbridges as edge ports, so it sends 
them a copy. Now rbridges RB1..RBk each label the ingressing packet with 
their respective "affinity" tree labels and send it into the TRILL 
network, where it reaches the edge of the trill network again, and the 
cycle repeats. You now have a loop.
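
To make the loop concrete, here is a minimal Python sketch of the flooding 
step. It is only an illustration under stated assumptions, not anything 
taken from the draft: the function name and the model are hypothetical, 
each CE switch treats its uplinks to RB1..RBk as a single LAG port, floods 
a frame on every port except the one it arrived on, and the CE-to-CE 
links are assumed loop-free so that flooding inside the CE network ends.

```python
from collections import deque

def reinjected_copies(num_ce, ce_links):
    """Count copies sent back up a CE switch's LAG after one decapsulation.

    num_ce   -- number of CE switches, named 0..num_ce-1 (hypothetical model)
    ce_links -- CE-to-CE links as (a, b) pairs; assumed loop-free so that
                plain flooding inside the CE network terminates
    """
    neighbors = {ce: set() for ce in range(num_ce)}
    for a, b in ce_links:
        neighbors[a].add(b)
        neighbors[b].add(a)

    # The decapsulating rbridge sends one copy down its vLAG to every CE switch.
    # Each queue entry is (switch, ingress), where ingress is "LAG" or a CE id.
    queue = deque((ce, "LAG") for ce in range(num_ce))
    reinjected = 0
    while queue:
        ce, ingress = queue.popleft()
        # Flood to every CE neighbor except the port the frame arrived on.
        for nbr in neighbors[ce]:
            if nbr != ingress:
                queue.append((nbr, ce))
        # A frame that arrived from another CE switch is also flooded up the
        # LAG, where some rbridge re-encapsulates it onto its affinity tree.
        if ingress != "LAG":
            reinjected += 1
    return reinjected

isolated = reinjected_copies(3, [])                # CE switches not interconnected
chained = reinjected_copies(3, [(0, 1), (1, 2)])   # CE0 - CE1 - CE2
```

With isolated CE switches nothing comes back toward the rbridges. With the 
three-switch chain, the single decapsulated packet produces six copies 
that re-enter the TRILL network, and each of those is decapsulated again 
at the edge.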

In addition, if you had interconnectivity between the CE switches, then 
the edge rbridges would be able to exchange LSPs with each other, which 
will result in one of them being elected the AF. The others will then not 
encap or decap CE packets. So the affinity-based approach would conflict 
with RFC 6325. All in all, we need the CE switches to be isolated here.

2) Applying an STP-based solution like the one described in RFC 6325 ("The 
Spanning Tree Solution") to break the connectivity between the CE 
switches won't work here, because this will render certain switches 
unreachable on some trees. In figure 13 ("wiring closet topology") in RFC 
6325, if RB1 has an affinity for tree k, then packets coming in from the 
trill cloud on that tree will need to get to B2 through RB1, but since STP 
has blocked B1-B2 this won't happen. This just reiterates that no form of 
interconnect whatsoever between the CE switches is permissible, and "The 
Spanning Tree Solution" will not work here.

3) In addition to the above constraint, each CE switch needs to be 
connected to every rbridge, and the consequences of any of the LAG links 
going down are catastrophic. Also, each rbridge needs to have a separate 
vLAG to each CE switch in the LAN. This is necessary because the entire CE 
network has been "emulated" by the pseudo-rbridge (RBv) in the draft. 

Let's say a packet arrives from the trill core at a certain rbridge on a 
tree that it has an affinity for. The assumption is that by decapsulating 
the packet and sending it to each of the attached CE links, all the 
stations in the CE network will get the packet. So if there's a certain CE 
switch which isn't connected to this rbridge, it will not get the packet 
(the packet can't get to the CE switch through another CE switch, because 
of the constraint in 1) above).

This means that a) each rbridge needs to have n vLAGs, one for each CE 
switch, and b) each CE switch needs to have k ports in its LAG, one for 
each rbridge. Note that most switches have scalability constraints on the 
number of LAG members and on the number of vLAGs. For small networks this 
may not be a problem. However, there is still a problem if one of the 
links in the LAG goes down. In that case, that CE switch will get 
permanently blackholed for some trees. (Essentially, the upstream rbridge 
on the other end of the down link no longer has any way of reaching the CE 
switch on the x trees it has an affinity for.)
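
The blackholing can be checked with a few lines of Python. This is a 
rough sketch, assuming a hypothetical data model that is not defined in 
the draft: an affinity map from tree to rbridge, and the set of 
rbridge-to-CE links that are currently up. Since the CE switches are 
isolated, a switch receives traffic on a tree only over its direct link 
to the rbridge that has the affinity for that tree.

```python
def blackholed_switches(affinity, links_up, ce_switches):
    """Map each tree to the CE switches that cannot receive its traffic.

    affinity    -- dict: tree id -> rbridge with the affinity for that tree
    links_up    -- set of (rbridge, ce_switch) pairs whose link is up
    ce_switches -- iterable of all CE switch names
    """
    result = {}
    for tree, rbridge in affinity.items():
        # Only a direct link to the affinity rbridge can deliver this tree.
        missing = {ce for ce in ce_switches if (rbridge, ce) not in links_up}
        if missing:
            result[tree] = missing
    return result

# Hypothetical example: two rbridges, two CE switches, three trees.
affinity = {1: "RB1", 2: "RB1", 3: "RB2"}
full_mesh = {("RB1", "CE1"), ("RB1", "CE2"), ("RB2", "CE1"), ("RB2", "CE2")}
after_failure = full_mesh - {("RB1", "CE2")}   # the RB1-CE2 link goes down

healthy = blackholed_switches(affinity, full_mesh, ["CE1", "CE2"])
broken = blackholed_switches(affinity, after_failure, ["CE1", "CE2"])
```

In the full mesh no switch is cut off. After the single RB1-CE2 failure, 
CE2 is unreachable on trees 1 and 2, the trees RB1 has an affinity for, 
even though CE2 still has a working link to RB2.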

At the very least, this proposal needs a way for an rbridge to 
"relinquish" its affinity trees when any vLAG link goes down, and a way 
for those trees to either be retired or be picked up by other rbridges. In 
addition, the rbridge will need to bring all of its CE-facing links down, 
so that the CE bridges don't try to use that rbridge to inject packets 
into the TRILL network.

4) Because of the constraint imposed by 1), you cannot interconnect two 
trill clouds using an intermediate CE cloud - the trill clouds will need 
to be merged using p2p trill links. This could be a problem if you plan to 
incrementally upgrade your switches to trill, as opposed to a fork-lift 
upgrade of your whole data center.

Note that the existing version of RFC 6325 does not have constraints on 
interconnectivity of CE switches or rbridges as described above.

Thoughts?

--
Sunny Rajagopalan
