Re: [trill] Thoughts on active-active edge
zhai.hongjun@zte.com.cn Fri, 14 December 2012 03:17 UTC
In-Reply-To: <CAFOuuo5o+=YT3TOVRp1Kxm_M3vL1Ko_1enb5fg2HuKjiUFRrKQ@mail.gmail.com>
To: Radia Perlman <radiaperlman@gmail.com>
MIME-Version: 1.0
X-KeepSent: A7FD3909:DFC4C7F6-48257AD4:0010B0EE; type=4; name=$KeepSent
X-Mailer: Lotus Notes Release 6.5.6 March 06, 2007
Message-ID: <OFA7FD3909.DFC4C7F6-ON48257AD4.0010B0EE-48257AD4.00124CC7@zte.com.cn>
From: zhai.hongjun@zte.com.cn
Date: Fri, 14 Dec 2012 11:17:13 +0800
X-MIMETrack: Serialize by Router on notes_smtp/zte_ltd(Release 8.5.3FP1 HF212|May 23, 2012) at 2012-12-14 11:17:08, Serialize complete at 2012-12-14 11:17:08
Content-Type: multipart/alternative; boundary="=_alternative 00124CC648257AD4_="
X-MAIL: mse01.zte.com.cn qBE3HGM1006133
Cc: Thomas Narten <narten@us.ibm.com>, trill-bounces@ietf.org, Sam Aldrin <aldrin.ietf@gmail.com>, Mingui Zhang <zhangmingui@huawei.com>, "trill@ietf.org" <trill@ietf.org>, "Tissa Senevirathne (tsenevir)" <tsenevir@cisco.com>
Subject: Re: [trill] Thoughts on active-active edge
Hi Radia,

Thanks for your answer.

> Why it is necessary to have a different pseudonode nickname if the uplink is to
> different sets of RBridges:
> If hypervisor H1 has uplinks to R1, R2, and R3, and uses pseudonode nickname P1,
> and hypervisor H2 has uplinks to R1 and R2 (or even had uplinks to R1, R2, and R3,
> but its link to R3 fails), then if H1 and H2 use the same nickname, say P1,
> then traffic for H2's MAC addresses might get sent to R3 (since R3 has to claim to
> be connected to P1 because it is, for H1). But it is no longer attached to H2
> because H2's uplink to R3 failed.

I think you are right if the member RBridges of an RBv do not share their learned addresses (of locally attached end nodes and of remote nodes) within the RBv. If they do share the learned addresses, RB3 will know that it can reach H2 via RB1 or RB2.

Since H2 has uplinks to RB1 and RB2, those two RBridges can learn H2's MAC addresses by observing H2's native frames. RB1 and RB2 can then share H2's MAC addresses within the RBv, so RB3 will know (from the shared information) that it can reach H2 via RB1 or RB2. On receipt of traffic for H2's MAC addresses, RB3 will tunnel the traffic to RB1 or RB2, and that RBridge will egress the traffic to H2.

The traffic tunneling is specified in draft-ietf-trill-clear-correct-06.txt (Section 2.4.2.1 on page 9 and Section 2.4.2.3 on page 10). The tunneling is simple: RB3 replaces the egress nickname in the TRILL header of the traffic with RB1's or RB2's nickname, then transmits the re-encapsulated traffic to RB1 or RB2.

Therefore, once MAC sharing among the member RBridges of an RBv and traffic tunneling are supported, one nickname per hypervisor is not necessary.

> The safest thing would be for every hypervisor to have a nickname.
> So, how many hypervisors are there likely to be? How usual would it be for all of
> them to attach to the same set of uplinks, so that we can use the same pseudonode nickname?

I do not know how many hypervisors there are likely to be, but I know that one RBridge usually has 24 or 48 downlinks that can act as uplinks for several hypervisors. So about 10 or 20 hypervisors can be dual-homed to the same two such RBridges. If those hypervisors use the same pseudo-nickname, it not only saves nicknames but also reduces the number of RPFC entries on the RBridges across the TRILL campus.

> Do we care about the case of one of a hypervisor's uplinks failing, in which case,
> would the RBridges know? Would R3 (the one to which the uplink failed) know? Would R1 and R2 know?

I think a member RBridge SHOULD detect the failure of an uplink to a hypervisor that it is directly connected to, and SHOULD tell the other member RBridges about the failure. Otherwise, we cannot make sure that multi-destination traffic is properly egressed to the hypervisor.

> So the main sort of configuration that I can think of, off the top of my head,
> is which pseudonodes go with which hypervisors.

If hypervisors do not perform TRILL encapsulation/decapsulation, they do not need to know which pseudonodes go with them. But if they do perform the encapsulation/decapsulation, they do need to know.

> I don't know how R1 can know that a particular port is to "H1" so that it can inform R2
> (via LSPs?) that R1 is attached to H1, and R2 can notice that it, indeed, is also attached to H1.

If I have not misunderstood you, I think R1 can learn H1's MAC addresses by observing H1's native frames. Then it will know which particular port leads to H1. R2 can do the same learning. If R2 has not learned H1's MAC addresses, R1 can share the addresses with it via ESADI PDUs.

> And part of the description of the problem would be answering questions like
> how many uplinks would need to be supported. Two at most? 30? If a lot,
> then solutions that require a tree for every uplink would be problematic if
> implementations don't want to support that many trees. Or is it OK to require
> lots of trees?

As far as I know, some vendors say that their proprietary MC-LAG technologies can support at most 8 member devices in theory, but in practical deployments two or three member devices in an MC-LAG group are enough. So I think supporting 8 trees is enough for now. It meets the practical requirements and does not create a large number of RPFC entries.

If I am wrong, please correct me.

Best Regards,
Zhai Hongjun

""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
Protocol Development Dept. VI, Central R&D Institute, ZTE Corporation
No. 68, Zijinghua Road, Yuhuatai District, Nanjing, P.R.China, 210012

Zhai Hongjun
Tel: +86-25-52877345
Email: zhai.hongjun@zte.com.cn
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""

Radia Perlman <radiaperlman@gmail.com>
Sent by: trill-bounces@ietf.org
2012-12-14 02:10

To: zhai.hongjun@zte.com.cn
Cc: Thomas Narten <narten@us.ibm.com>, trill-bounces@ietf.org, Sam Aldrin <aldrin.ietf@gmail.com>, Mingui Zhang <zhangmingui@huawei.com>, "trill@ietf.org" <trill@ietf.org>, "Tissa Senevirathne (tsenevir)" <tsenevir@cisco.com>
Subject: Re: [trill] Thoughts on active-active edge

Answering Zhai Hongjun's questions:

Why it is necessary to have a different pseudonode nickname if the uplink is to different sets of RBridges:

If hypervisor H1 has uplinks to R1, R2, and R3, and uses pseudonode nickname P1, and hypervisor H2 has uplinks to R1 and R2 (or even had uplinks to R1, R2, and R3, but its link to R3 fails), then if H1 and H2 use the same nickname, say P1, then traffic for H2's MAC addresses might get sent to R3 (since R3 has to claim to be connected to P1 because it is, for H1). But it is no longer attached to H2 because H2's uplink to R3 failed.

The safest thing would be for every hypervisor to have a nickname. So, how many hypervisors are there likely to be? How usual would it be for all of them to attach to the same set of uplinks, so that we can use the same pseudonode nickname?

Do we care about the case of one of a hypervisor's uplinks failing, in which case, would the RBridges know? Would R3 (the one to which the uplink failed) know? Would R1 and R2 know? Even if R3 knew, how could it alert R1 and R2 to now use a different pseudonode nickname for H2? Would it be obvious to them which of the hypervisors that have uplinks to R1, R2, and R3 they are referring to?

All of this must be configured, I assume, and if the configuration is wrong, then who knows what happens... presumably that traffic may or may not get delivered to a hypervisor. And even if configured properly, what happens when uplinks fail? Again, presumably, traffic may or may not get delivered to the hypervisor whose uplink fails.

-------------

As for VLANs... that actually is not a problem here, since we're not using AFs. The hypervisor determines which uplink to send something to. And which tree is being used for distribution determines which of R1, R2, or R3 will decapsulate the packet.

So the main sort of configuration that I can think of, off the top of my head, is which pseudonodes go with which hypervisors. And I do think configuration is scary, especially if there's no "sanity check" whereby the RBs can compare notes. I don't know how R1 can know that a particular port is to "H1" so that it can inform R2 (via LSPs?) that R1 is attached to H1, and R2 can notice that it, indeed, is also attached to H1.

--------

And part of the description of the problem would be answering questions like how many uplinks would need to be supported. Two at most? 30? If a lot, then solutions that require a tree for every uplink would be problematic if implementations don't want to support that many trees. Or is it OK to require lots of trees?

So I think there are lots of things that should be written down, as part of describing the problem.

Radia

On Thu, Dec 13, 2012 at 4:03 AM, <zhai.hongjun@zte.com.cn> wrote:

Hi Radia,

> This case is scary because the RBridges on the uplink cannot see Hellos from each other,
> so if misconfigured, at the very least I could imagine multiple RBridges decapsulating
> multicast from the campus to the hypervisor.
> Anyway...how many uplinks do we need to support? Do we care about problems due to misconfiguration?

I don't know what the misconfiguration refers to. Is it the set of VLANs for which an RBridge acts as AF?

> Are there cases where there are lots of hypervisors, where they attach to different subsets of edge RBs?
> In that case, we might eat up a lot of nicknames, since if one hypervisor is attached to {R1, R2},
> and another is attached to {R1, R2, R3}, they cannot use the same pseudonode nickname.

I don't know why the two sets of RBridges cannot use the same pseudo-nickname. If the learned MAC addresses can be shared among the member RBridges of an RBv, and TRILL data frames can be tunneled to another member RBridge that can egress the frame, I think they can use the same pseudo-nickname.

If I am wrong, please correct me.

Best Regards,
Zhai Hongjun

""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
Protocol Development Dept. VI, Central R&D Institute, ZTE Corporation
No. 68, Zijinghua Road, Yuhuatai District, Nanjing, P.R.China, 210012

Zhai Hongjun
Tel: +86-25-52877345
Email: zhai.hongjun@zte.com.cn
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""

Radia Perlman <radiaperlman@gmail.com>
Sent by: trill-bounces@ietf.org
2012-12-13 15:13

To: Mingui Zhang <zhangmingui@huawei.com>
Cc: Thomas Narten <narten@us.ibm.com>, Sam Aldrin <aldrin.ietf@gmail.com>, "Tissa Senevirathne (tsenevir)" <tsenevir@cisco.com>, "trill@ietf.org" <trill@ietf.org>
Subject: Re: [trill] Thoughts on active-active edge

I think it would be good to have a document that explains the problem... I certainly don't believe I know all the cases that need to be solved.

I think I understand the hypervisor case... where the hypervisor decides which uplink to send things to, and never forwards between the uplinks. This case is scary because the RBridges on the uplink cannot see Hellos from each other, so if misconfigured, at the very least I could imagine multiple RBridges decapsulating multicast from the campus to the hypervisor.

Anyway... how many uplinks do we need to support? Do we care about problems due to misconfiguration?

In cases like this, is it common to also have pt-to-pt links between all the RBs attaching to the hypervisor? If so, then it seems like it would be possible for them to coordinate to at least detect misconfiguration, and possibly play games with forwarding messages to each other (e.g., if one of them is not attached to a tree and needs to encapsulate a multidestination frame).

How many trees does the campus need?

Are there cases where there are lots of hypervisors, where they attach to different subsets of edge RBs? In that case, we might eat up a lot of nicknames, since if one hypervisor is attached to {R1, R2}, and another is attached to {R1, R2, R3}, they cannot use the same pseudonode nickname.

Are there cases other than hypervisors? I think there are cases of bridges that have this behavior (a port with a bunch of endnodes, and several uplinks, where the bridge does not forward between the uplinks).

If this has been written down anywhere, can anyone point me to it? If not, it seems really prudent to answer these (and I'm sure other) questions before arguing about specific solutions.

Radia

_______________________________________________
trill mailing list
trill@ietf.org
https://www.ietf.org/mailman/listinfo/trill
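
The MAC sharing and tunneling argument in the reply at the top of this message (RB3 re-addressing traffic for H2 to RB1 or RB2 after H2's uplink to RB3 fails) can be illustrated with a small model. This is only a sketch under stated assumptions, not the behavior specified in draft-ietf-trill-clear-correct or in any RBridge implementation: the class and method names (TrillFrame, MemberRBridge, learn_local, uplink_failed, handle) are hypothetical, the sharing of learned addresses (which the message suggests could be done via ESADI) is mocked by direct method calls, and picking the lowest-sorting nickname among the candidates is an arbitrary choice made here for determinism.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class TrillFrame:
    """Minimal stand-in for a TRILL data frame (hypothetical, not the wire format)."""
    ingress_nickname: str
    egress_nickname: str   # may be the pseudo-nickname shared by the RBv
    inner_dst_mac: str


class MemberRBridge:
    """One member RBridge of an RBv; all names are illustrative assumptions."""

    def __init__(self, nickname, pseudo_nickname):
        self.nickname = nickname
        self.pseudo_nickname = pseudo_nickname
        self.local_macs = set()   # MACs learned from native frames on local ports
        self.shared_macs = {}     # MAC -> nicknames of other members that reported it

    def learn_local(self, mac, peers):
        # Learn a MAC locally and share it within the RBv (mocked ESADI-style sharing).
        self.local_macs.add(mac)
        for peer in peers:
            peer.shared_macs.setdefault(mac, set()).add(self.nickname)

    def uplink_failed(self, mac, peers):
        # The uplink to the end node failed: forget the MAC and tell the other members.
        self.local_macs.discard(mac)
        for peer in peers:
            peer.shared_macs.get(mac, set()).discard(self.nickname)

    def handle(self, frame):
        # Frames not addressed to this RBridge or to the RBv are just forwarded on.
        if frame.egress_nickname not in (self.nickname, self.pseudo_nickname):
            return "transit", frame
        # Locally attached destination: decapsulate and egress.
        if frame.inner_dst_mac in self.local_macs:
            return "egress", frame
        # Otherwise rewrite the egress nickname to a member that can egress the frame.
        owners = self.shared_macs.get(frame.inner_dst_mac)
        if not owners:
            return "unknown", frame
        target = min(owners)  # arbitrary deterministic pick (assumption)
        return "tunnel", dataclasses.replace(frame, egress_nickname=target)


# Scenario from the thread: H2 was attached to RB1, RB2 and RB3, then its
# uplink to RB3 fails while all three members keep claiming pseudo-nickname P1.
rb1, rb2, rb3 = (MemberRBridge(n, "P1") for n in ("RB1", "RB2", "RB3"))
members = [rb1, rb2, rb3]
H2_MAC = "00:00:5e:00:53:02"  # documentation-range MAC, purely illustrative

for rb in members:
    rb.learn_local(H2_MAC, [m for m in members if m is not rb])
rb3.uplink_failed(H2_MAC, [rb1, rb2])

frame = TrillFrame(ingress_nickname="RB9", egress_nickname="P1", inner_dst_mac=H2_MAC)
action, tunneled = rb3.handle(frame)
print(action, tunneled.egress_nickname)   # tunnel RB1
final_action, _ = rb1.handle(tunneled)
print(final_action)                       # egress
```

In this model the shared pseudo-nickname still delivers H2's traffic only because RB3 both learns of the failure and holds the addresses shared by RB1 and RB2. If either piece of state is missing, handle() returns "unknown", which corresponds to the "traffic may or may not get delivered" concern raised in the quoted message.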