Re: [rbridge] network topology constraints in draft-tissa-trill-cmt-00

"Tissa Senevirathne (tsenevir)" <tsenevir@cisco.com> Thu, 29 March 2012 21:14 UTC

Date: Thu, 29 Mar 2012 13:57:54 -0700
From: "Tissa Senevirathne (tsenevir)" <tsenevir@cisco.com>
To: Mingui Zhang <zhangmingui@huawei.com>, Santosh Rajagopalan <sunny.rajagopalan@us.ibm.com>, rbridge@postel.org

When an RBridge goes down, in the various failure scenarios today, all of the RBridges in the network recalculate the SPF. We are not optimizing IS-IS calculation methods. There can be many different ways to optimize such things, but that is not the intention of a base draft.

 

 

From: rbridge-bounces@postel.org [mailto:rbridge-bounces@postel.org] On Behalf Of Mingui Zhang
Sent: Thursday, March 29, 2012 1:38 PM
To: Santosh Rajagopalan; rbridge@postel.org
Subject: Re: [rbridge] network topology constraints in draft-tissa-trill-cmt-00

 

Hi Sunny, 

 

I agree with your judgement about the LAG link failure. 

 

I have another concern with 5.6.1, "Edge RBridge RBk failure". I think that CMT should have the chance to offer an incremental recalculation of distribution trees under nodal failures. In other words, a nodal failure should affect only one distribution tree while the others remain unchanged. But paragraph 3 of this section makes me believe that there is an overall recalculation.

 

Thanks,

Mingui

 

________________________________

From: rbridge-bounces@postel.org [rbridge-bounces@postel.org] On Behalf Of Santosh Rajagopalan [sunny.rajagopalan@us.ibm.com]
Sent: March 30, 2012 1:15
To: rbridge@postel.org
Subject: Re: [rbridge] network topology constraints in draft-tissa-trill-cmt-00

Tissa, 
Are you saying that the solution is only applicable to multi-homed hosts, or that it's also applicable to edge switches? (The draft says the example is valid for "either End Stations and/or Legacy bridges".) If you're expanding the scope to include the latter, then the inability to interconnect them is a serious deficiency, as are the other points listed below. Anyway, if the solution is being proposed only for a narrow case, it should be mentioned clearly in the draft. It's not enough for it to be "discussed in Taipei". The draft talks about a "typical deployment scenario" without stating that this is the only possible deployment scenario.

Even if you narrow the scope considerably, you still need to address the issue of LAG link failure (see point 3 below). In this draft, a single link failure leads to us bringing down all southbound ports on the upstream device, which is quite drastic. This draft needs to address this issue for it to be workable.

-- 
Sunny Rajagopalan 



From:        "Tissa Senevirathne (tsenevir)" <tsenevir@cisco.com> 
To:        Santosh Rajagopalan/Santa Clara/IBM@IBMUS, <rbridge@postel.org> 
Date:        03/28/2012 04:17 PM 
Subject:        RE: [rbridge] network topology constraints in draft-tissa-trill-cmt-00 

________________________________




The key application of this draft is multi-homed devices that serve as end stations, e.g. servers in data centers. Please see Reference topology 2. It was discussed in Taipei, and stated in the draft, that this solution is applicable to the multi-homed active-active edge.
  
From: rbridge-bounces@postel.org [mailto:rbridge-bounces@postel.org] On Behalf Of Santosh Rajagopalan
Sent: Wednesday, March 28, 2012 3:01 PM
To: rbridge@postel.org
Subject: [rbridge] network topology constraints in draft-tissa-trill-cmt-00 
  
This is a clever draft, but I wanted to point out some network topology constraints in the proposal: 

1) It looks like you can only use this proposal if the CE switches are *not* interconnected in any fashion outside of the TRILL network. This is because sending packets into the CE network strips off information needed to prevent loops. Let me illustrate using the example from the draft: RB1 receives a multidestination packet from the TRILL campus on tree 1, and it also has an "affinity" for that tree. So it decaps it and sends the packet into the CE network (basically, a copy gets sent to each of CE1..CEn).

Let's assume that the CE network is composed of interconnected switches, instead of the isolated switches shown in the picture. This is reasonable, because it avoids needing to take the extra hop to the aggregation layer for end-systems on the same chassis or rack. This means that a broadcast packet would be replicated by the CE network to each of its switches, including CE1..CEn. So CE1..CEn just got their first duplicate. Each of these switches looks at the attached rbridges as edge ports, so it sends them a copy. Now, rbridges RB1..RBk each label the ingressing packet with their respective "affinity" tree labels and send it into the TRILL network, where it reaches the edge of the TRILL network, and the cycle repeats. You now have a loop.
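The growth of that loop can be sketched with a toy count. This is only an illustration under simplifying assumptions that are not in the draft: the function name and parameters are hypothetical, the interconnected CE network is assumed to deliver every frame entering at one CE switch to all n-1 other CE switches, and each of those switches is assumed to forward its copy up to an rbridge, which re-encapsulates it onto a tree whose affinity rbridge later decaps it again.

```python
def frames_per_cycle(n_ce, ce_interconnected, cycles):
    """Frames decapsulated toward the CE side at the start of each cycle.

    Toy model (assumption, not taken from the draft): each decapsulated
    frame is copied to all n_ce CE switches. If the CE switches are
    interconnected, every copy is flooded to the n_ce - 1 peer switches,
    and each peer sends it back up to an rbridge, which re-injects it.
    """
    history = []
    frames = 1  # one multidestination frame arrives from the campus
    for _ in range(cycles):
        history.append(frames)
        if not ce_interconnected:
            frames = 0  # isolated CE switches only deliver to local hosts
            continue
        frames = frames * n_ce * (n_ce - 1)
    return history


print(frames_per_cycle(3, ce_interconnected=False, cycles=3))  # [1, 0, 0]
print(frames_per_cycle(3, ce_interconnected=True, cycles=3))   # [1, 6, 36]
```

With isolated CE switches the frame dies out after one delivery. With interconnected switches the count grows every cycle under this model, which is the loop described above.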

In addition, if you had interconnectivity between the CE switches, then the edge rbridges would be able to exchange LSPs with each other, which will result in one of them being elected the AF. The others will then not encap or decap CE packets. So the affinity-based approach would conflict with RFC 6325. All in all, we need the CE switches to be isolated here.

2) Applying an STP-based solution like the one described in RFC 6325 ("The Spanning Tree Solution", <http://tools.ietf.org/html/rfc6325#appendix-A.3.3>) to break the connectivity between the CE switches won't work here, because this will render certain switches unreachable on some trees. In Figure 13 ("wiring closet topology") in RFC 6325, if RB1 has an affinity for tree k, then packets coming in from the TRILL cloud on that tree will need to get to B2 through RB1, but since STP has blocked B1-B2 this won't happen. This just reiterates that no form of interconnect whatsoever between the CE switches is permissible, and "The Spanning Tree Solution" will not work here.

3) In addition to the above constraint, each CE switch needs to be connected to every rbridge, and the consequences of any of the LAG links going down are catastrophic. Also, each rbridge needs to have a vLAG to each CE switch in the LAN. This is necessary because the entire CE network has been "emulated" by the pseudo-rbridge (RBv) in the draft.

Let's say a packet arrives from the trill core at a certain rbridge on a tree that it has an affinity for. The assumption is that by decapsulating the packet and sending it to each of the attached CE links, all the stations in the CE network will get the packet. So if there's a certain CE switch which isn't connected to this rbridge, it will not get the packet (the packet can't get to the CE switch through another CE switch, because of the constraint in 1) above). 

This means that a) each rbridge needs to have n vLAGs, one for each CE switch, and b) each CE switch needs to have k ports in its LAG, one for each rbridge. Note that most switches have scalability constraints on the number of LAG members and on the number of vLAGs. For small networks this may not be a problem. However, it may still be a problem if one of the links on the LAG goes down. In that case, that CE switch will get permanently blackholed for some trees. (Essentially, the upstream rbridge on the other end of the down link no longer has any way of reaching the CE switch on the x trees it has an affinity for.)
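To make the link count and the blackholing concrete, here is a minimal sketch. The names (`blackholed`, `affinity`, `links`) and the tree numbering are hypothetical, and it assumes that each tree is delivered to the CE side only by the one rbridge that holds the affinity for it, with no CE-to-CE path as a fallback (constraint 1 above).

```python
from itertools import product


def blackholed(ce_switches, affinity, links):
    """Return {ce: sorted tree ids} for trees that can no longer reach ce.

    affinity: dict mapping rbridge name -> set of tree ids it has an
              affinity for (assumed: exactly one rbridge per tree).
    links:    set of (rbridge, ce) pairs whose LAG member link is up.
    """
    result = {}
    for ce in ce_switches:
        lost = sorted(
            tree
            for rb, trees in affinity.items()
            if (rb, ce) not in links
            for tree in trees
        )
        if lost:
            result[ce] = lost
    return result


rbridges = ["RB1", "RB2", "RB3"]          # k = 3
ce_switches = ["CE1", "CE2"]              # n = 2
affinity = {"RB1": {1, 2}, "RB2": {3}, "RB3": {4}}

links = set(product(rbridges, ce_switches))
full_mesh_links = len(links)              # k * n = 6 links are required

links.discard(("RB1", "CE2"))             # a single LAG member goes down
print(blackholed(ce_switches, affinity, links))  # {'CE2': [1, 2]}
```

In this example one failed link out of six is enough to cut CE2 off from both trees that RB1 has an affinity for.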

At the very least, this proposal needs a way for an rbridge to "relinquish" its affinity trees when any vLAG link goes down, and a way for those trees to either be retired or be picked up by other rbridges. In addition, the rbridge will need to bring all of its CE-facing links down, so that the CE bridges don't try to use that rbridge to inject packets into the TRILL network.
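One possible shape for such a "relinquish" step is sketched below. It is purely illustrative: the draft does not define this mechanism, and the function name, the round-robin handover, and the idea of a list of healthy rbridges are all assumptions made here for the example.

```python
def relinquish(affinity, failed_rb, healthy_rbs):
    """Return (new_affinity, retired_trees) after failed_rb gives up its trees.

    Hypothetical policy: the orphaned trees are handed round-robin to the
    rbridges that still have all of their vLAG links up. If no such
    rbridge exists, the trees are retired instead.
    """
    new_affinity = {
        rb: set(trees) for rb, trees in affinity.items() if rb != failed_rb
    }
    orphaned = sorted(affinity.get(failed_rb, ()))
    takers = sorted(rb for rb in healthy_rbs if rb != failed_rb)
    if not takers:
        return new_affinity, orphaned  # nobody can pick them up: retire
    for i, tree in enumerate(orphaned):
        new_affinity.setdefault(takers[i % len(takers)], set()).add(tree)
    return new_affinity, []


affinity = {"RB1": {1, 2}, "RB2": {3}, "RB3": {4}}
print(relinquish(affinity, "RB1", ["RB2", "RB3"]))
# ({'RB2': {1, 3}, 'RB3': {2, 4}}, [])
```

Whether retired trees or reassigned trees are preferable, and how the change would be signalled to the rest of the campus, is left open here, as it is in the text above.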

4) Because of the constraint imposed by 1), you cannot interconnect two trill clouds using an intermediate CE cloud - the trill clouds will need to be merged using p2p trill links. This could be a problem if you plan to incrementally upgrade your switches to trill, as opposed to a fork-lift upgrade of your whole data center. 

Note that the existing version of RFC 6325 does not have constraints on interconnectivity of CE switches or rbridges as described above. 

Thoughts? 

-- 
Sunny Rajagopalan 
