Re: [trill] Thoughts on active-active edge

zhai.hongjun@zte.com.cn Fri, 14 December 2012 03:33 UTC

In-Reply-To: <OF2917B755.CCA8E34E-ON48257AD4.0008BD50-48257AD4.000DE9CB@LocalDomain>
To: hu.fangwei@zte.com.cn
Message-ID: <OF9AA48005.287A265A-ON48257AD4.00127A93-48257AD4.0013C340@zte.com.cn>
From: zhai.hongjun@zte.com.cn
Date: Fri, 14 Dec 2012 11:33:11 +0800
Cc: Thomas Narten <narten@us.ibm.com>, trill-bounces@ietf.org, Radia Perlman <radiaperlman@gmail.com>, Sam Aldrin <aldrin.ietf@gmail.com>, Mingui Zhang <zhangmingui@huawei.com>, "trill@ietf.org" <trill@ietf.org>, "Tissa Senevirathne (tsenevir)" <tsenevir@cisco.com>
Subject: Re: [trill] Thoughts on active-active edge

> So pseudonode nickname is not fit for active-active edge.

Maybe I need to clarify what the pseudo-nickname is in the PN draft.

A pseudo-nickname is a nickname that identifies a group of RBridges (e.g.,
an RBv), not a pseudonode.
This nickname is advertised in each member RBridge's own LSP, not in its
pseudonode LSP.
Furthermore, no member RBridge originates pseudonode LSPs in the
active-active edge,
since they cannot see each other's Hellos.

If the term pseudo-nickname confuses anyone in this WG, we can change it to
RBv-nickname.


Best Regards,
Zhai Hongjun
""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""
 Protocol Development Dept.VI, Central R&D Institute, ZTE Corporation
 No. 68, Zijinghua Road, Yuhuatai District, Nanjing, P.R.China, 210012
 
 Zhai Hongjun
 
 Tel: +86-25-52877345
 Email: zhai.hongjun@zte.com.cn
"""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""""





胡方伟175772/user/zte_ltd
2012-12-14 10:31

To: Radia Perlman <radiaperlman@gmail.com>
Cc: Sam Aldrin <aldrin.ietf@gmail.com>, Thomas Narten <narten@us.ibm.com>,
"trill@ietf.org" <trill@ietf.org>, trill-bounces@ietf.org,
"Tissa Senevirathne (tsenevir)" <tsenevir@cisco.com>, zhai.hongjun@zte.com.cn,
Mingui Zhang <zhangmingui@huawei.com>
Subject: Re: Re: [trill] Thoughts on active-active edge







In the shared LAN link scenario, only one RBridge (the AF) performs the
TRILL encapsulation. There we can introduce a pseudonode nickname to
identify the shared link and avoid potential data loss when the AF changes.
The active-active edge scenario is different: all the RBridges in the LAG
group must perform TRILL encapsulation, and the RBridges cannot see Hellos
from each other, so there is actually no AF in this case. So the pseudonode
nickname is not a fit for the active-active edge.

We could use a control protocol (for example, ESADI) to learn the MAC and
nickname bindings, and avoid MAC flip-flopping by adding a new ID (a
Link-ID). If the remote RBridge receives two records with the same Link-ID,
it should treat both of them as valid.

For example, R1 and R2 tell R8 that H1 is multi-homed to the TRILL campus,
and they use the same Link-ID to identify the LAG group. When R8 receives
this information, it keeps two records for H1: one for R1 and one for R2.
Return data frames can choose either R1 or R2 as the egress nickname.

If the link between H1 and R2 fails, the R2 record is cleared.

If H1 has another link, say to R3, and R3 uses the same Link-ID, then R8
keeps another record for H1.
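
The Link-ID idea above can be sketched roughly as follows. This is only an
illustration of the bookkeeping on a remote RBridge such as R8, not ESADI
itself: the class name, method names, and the rule that a different Link-ID
replaces the old records (ordinary MAC-move behaviour) are assumptions made
for the example.

```python
class RemoteMacTable:
    """Hypothetical MAC table kept by a remote RBridge (R8 in the example)."""

    def __init__(self):
        # mac -> (link_id, set of egress nicknames reporting that MAC)
        self._records = {}

    def learn(self, mac, link_id, nickname):
        current = self._records.get(mac)
        if current is not None and current[0] == link_id:
            # Same Link-ID: same LAG group, so keep both records (no MAC flip).
            current[1].add(nickname)
        else:
            # Assumption: a different Link-ID means the host moved, so replace.
            self._records[mac] = (link_id, {nickname})

    def withdraw(self, mac, nickname):
        """Clear one record, e.g. after the link between H1 and R2 fails."""
        current = self._records.get(mac)
        if current is None:
            return
        current[1].discard(nickname)
        if not current[1]:
            del self._records[mac]

    def egress_candidates(self, mac):
        """Nicknames usable as egress for `mac`, sorted for stable output."""
        current = self._records.get(mac)
        return sorted(current[1]) if current else []


r8 = RemoteMacTable()
r8.learn("H1-mac", link_id=7, nickname="R1")
r8.learn("H1-mac", link_id=7, nickname="R2")
print(r8.egress_candidates("H1-mac"))  # ['R1', 'R2']
r8.withdraw("H1-mac", "R2")            # link H1-R2 fails
r8.learn("H1-mac", link_id=7, nickname="R3")
print(r8.egress_candidates("H1-mac"))  # ['R1', 'R3']
```

Keying the acceptance rule on the Link-ID is what lets R8 tell "second
member of the same LAG group" apart from "host moved to a new location".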

Best regards. 



Radia Perlman <radiaperlman@gmail.com> 
Sent by: trill-bounces@ietf.org
2012-12-14 02:10

To: zhai.hongjun@zte.com.cn
Cc: Thomas Narten <narten@us.ibm.com>, trill-bounces@ietf.org, Sam Aldrin
<aldrin.ietf@gmail.com>, Mingui Zhang <zhangmingui@huawei.com>,
"trill@ietf.org" <trill@ietf.org>, "Tissa Senevirathne (tsenevir)"
<tsenevir@cisco.com>
Subject: Re: [trill] Thoughts on active-active edge








Answering Zhai Hongjun's questions: 

Why it is necessary to have a different pseudonode nickname if the uplinks
are to different sets of RBridges:

If hypervisor H1 has uplinks to R1, R2, and R3, and uses pseudonode 
nickname P1, and hypervisor H2 has uplinks to R1 and R2 (or even had 
uplinks to R1, R2, and R3, but its link to R3 fails), then if H1 and H2 
use the same nickname, say P1, then traffic for H2's MAC addresses might 
get sent to R3 (since R3 has to claim to be connected to P1 because it is, 
for H1).  But it is no longer attached to H2 because H2's uplink to R3 
failed. 
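
The failure case above can be checked mechanically. The sketch below is only
an illustration: the function name and the dictionary layout are made up for
the example, and it models nothing more than "every RBridge attached to any
hypervisor using a nickname must advertise that nickname".

```python
def stale_egress_rbridges(attachments, nickname_of):
    """For each hypervisor, list the RBridges that advertise its nickname
    but have no link to it, so traffic sent to them cannot be delivered."""
    # nickname -> every RBridge that must claim attachment to that nickname
    advertisers = {}
    for hypervisor, rbridges in attachments.items():
        advertisers.setdefault(nickname_of[hypervisor], set()).update(rbridges)

    return {
        hypervisor: sorted(advertisers[nickname_of[hypervisor]] - set(rbridges))
        for hypervisor, rbridges in attachments.items()
    }


# H1 has uplinks to R1, R2, R3; H2 only to R1 and R2 (or its R3 link failed).
attachments = {"H1": {"R1", "R2", "R3"}, "H2": {"R1", "R2"}}

shared = stale_egress_rbridges(attachments, {"H1": "P1", "H2": "P1"})
separate = stale_egress_rbridges(attachments, {"H1": "P1", "H2": "P2"})
print(shared)    # {'H1': [], 'H2': ['R3']}
print(separate)  # {'H1': [], 'H2': []}
```

With the shared nickname P1, R3 shows up as a possible egress for H2 even
though it has no link to H2; with one nickname per attachment set the
problem does not arise, at the cost of more nicknames.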

The safest thing would be for every hypervisor to have a nickname. 

So, how many hypervisors are there likely to be?  How usual would it be
for all of them to attach to the same set of uplinks, so that we can use
the same pseudonode nickname?  Do we care about the case of one of a
hypervisor's uplinks failing, in which case, would the RBridges know?
Would R3 (the one to which the uplink failed) know?  Would R1 and R2 know?
Even if R3 knew, how could it alert R1 and R2 to now use a different
pseudonode nickname for H2?  Would it be obvious to them which of the
hypervisors that have uplinks to R1, R2, and R3 they are referring to?

All of this must be configured, I assume, and if the configuration is
wrong, then who knows what happens... presumably that traffic may or may
not get delivered to a hypervisor.  And even if configured properly, what
happens when uplinks fail?  Again, presumably, traffic may or may not get
delivered to the hypervisor whose uplink fails.

------------- 
As for VLANs...that actually is not a problem here, since we're not using 
AFs.  The hypervisor determines which uplink to send something to.  And 
which tree is being used for distribution determines which of R1, R2, or 
R3 will decapsulate the packet. 

So the main sort of configuration that I can think of, off the top of my 
head, is which pseudonodes go with which hypervisors.  And I do think 
configuration is scary, especially if there's no "sanity check" whereby 
the RBs can compare notes.  I don't know how R1 can know that a particular 
port is to "H1" so that it can inform R2 (via LSPs?) that R1 is attached 
to H1, and R2 can notice that it, indeed, is also attached to H1. 

-------- 
And part of the description of the problem would be answering questions 
like how many uplinks would need to be supported.  Two at most?  30?  If a 
lot, then solutions that require a tree for every uplink would be 
problematic if implementations don't want to support that many trees.  Or 
is it OK to require lots of trees? 

So I think there are lots of things that should be written down, as part 
of describing the problem. 



Radia 





On Thu, Dec 13, 2012 at 4:03 AM, <zhai.hongjun@zte.com.cn> wrote: 

Hi Radia 

> This case is scary because the RBridges on the uplink cannot see Hellos
> from each other, so if misconfigured, at the very least I could imagine
> multiple RBridges decapsulating multicast from the campus to the
> hypervisor.

> Anyway...how many uplinks do we need to support?  Do we care about
> problems due to misconfiguration?

I don't know what the misconfiguration refers to. Is it the set of VLANs
for which an RBridge acts as AF?


> Are there cases where there are lots of hypervisors, where they attach
> to different subsets of edge RBs?
> In that case, we might eat up a lot of nicknames, since if one
> hypervisor is attached to {R1, R2}, and another is attached to
> {R1, R2, R3}, they cannot use the same pseudonode nickname.

I don't know why the two sets of RBridges cannot use the same
pseudo-nickname. If the learned MAC addresses can be shared among the
member RBridges of an RBv, and TRILL data frames can be tunneled to another
member RBridge that can egress the frame, I think they can use the same
pseudo-nickname.

If I am wrong, please correct me. 
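
The condition described above (shared MAC learning plus tunneling between
members) might look roughly like this. It is a sketch under assumptions, not
a mechanism from any draft: the function name, the table layout, and the
rule for choosing a peer are invented for illustration.

```python
def egress_or_tunnel(receiving_rb, dest_mac, shared_mac_table):
    """Decide what a member RBridge does with a frame sent to the shared
    pseudo-nickname.

    shared_mac_table maps a MAC to the set of member RBridges that currently
    have a live link to it (the "shared learned MAC addresses").
    """
    attached = shared_mac_table.get(dest_mac, set())
    if receiving_rb in attached:
        return ("egress", receiving_rb)
    if attached:
        # Assumption: pick the lowest-named peer; a real design would need
        # its own selection rule.
        return ("tunnel", min(attached))
    # Unknown destination: handling is outside the scope of this sketch.
    return ("unknown", None)


table = {"H1-mac": {"R1", "R2", "R3"}, "H2-mac": {"R1", "R2"}}
print(egress_or_tunnel("R3", "H1-mac", table))  # ('egress', 'R3')
print(egress_or_tunnel("R3", "H2-mac", table))  # ('tunnel', 'R1')
```

The sketch only works if the shared table is kept current on every member,
which is the coordination question raised elsewhere in this thread.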


Best Regards,
Zhai Hongjun




Radia Perlman <radiaperlman@gmail.com> 
Sent by: trill-bounces@ietf.org
2012-12-13 15:13

To: Mingui Zhang <zhangmingui@huawei.com>
Cc: Thomas Narten <narten@us.ibm.com>, Sam Aldrin <aldrin.ietf@gmail.com>,
"Tissa Senevirathne (tsenevir)" <tsenevir@cisco.com>, "trill@ietf.org"
<trill@ietf.org>
Subject: Re: [trill] Thoughts on active-active edge










I think it would be good to have a document that explains the problem...I 
certainly don't believe I know all the cases that need to be solved.  I 
think I understand the hypervisor case...where the hypervisor decides 
which uplink to send things to, and never forwards between the up-links. 

This case is scary because the RBridges on the uplink cannot see Hellos 
from each other, so if misconfigured, at the very least I could imagine 
multiple RBridges decapsulating multicast from the campus to the 
hypervisor. 

Anyway...how many uplinks do we need to support?  Do we care about 
problems due to misconfiguration? 

In cases like this, is it common to also have pt-to-pt links between all 
the RBs attaching to the hypervisor?  If so, then it seems like it would 
be possible for them to coordinate to at least detect misconfiguration, 
and possibly play games with forwarding messages to each other (e.g., if 
one of them is not attached to a tree and needs to encapsulate a 
multidestination frame). 

How many trees does the campus need? 

Are there cases where there are lots of hypervisors, where they attach to 
different subsets of edge RBs?  In that case, we might eat up a lot of 
nicknames, since if one hypervisor is attached to {R1, R2}, and another is 
attached to {R1, R2, R3}, they cannot use the same pseudonode nickname. 

Are there cases other than hypervisors?  I think there are cases of
bridges that have this behavior (a port with a bunch of endnodes, and
several up-links, where the bridge does not forward between the up-links).


If this has been written down anywhere, can anyone point me to it?  If 
not, it seems really prudent to answer these (and I'm sure other) 
questions before arguing about specific solutions. 

Radia 

_______________________________________________
trill mailing list
trill@ietf.org
https://www.ietf.org/mailman/listinfo/trill
