Re: [trill] Thoughts on active-active edge

"Tissa Senevirathne (tsenevir)" <tsenevir@cisco.com> Wed, 12 December 2012 06:11 UTC

From: "Tissa Senevirathne (tsenevir)" <tsenevir@cisco.com>
To: Radia Perlman <radiaperlman@gmail.com>
Date: Wed, 12 Dec 2012 06:10:52 +0000
Cc: "trill@ietf.org" <trill@ietf.org>
Subject: Re: [trill] Thoughts on active-active edge

Radia

Please see in line

From: Radia Perlman [mailto:radiaperlman@gmail.com]
Sent: Tuesday, December 11, 2012 10:00 PM
To: Tissa Senevirathne (tsenevir)
Cc: trill@ietf.org
Subject: Re: [trill] Thoughts on active-active edge

Tissa, there are always technical tradeoffs between approaches.

Tissa said:
>>[Answer] This depends on the traffic, Radia. You cannot assume that all multicast hashes to node R1 and all unicast to R2 and R3, and hence that multicast/unicast address ping-pong is rare. On the contrary, you could have a heavy multicast stream and a unicast stream both hashing to the same RBridge, RB1.

I'm not sure what relevance there is to whether unicast hashes to R1, R2, R3, or alternates between them, since the rule is that they all use the pseudonode nickname when ingressing from E. The only way there will be frequent endnode moves for E is if E frequently alternates sending multicast and unicast.  All unicast from E, whether it is sent to R1, R2, or R3, will be perceived by R8 as coming from pseudonode P.  Multicast, on the other hand, with the proposed rule, could bounce between R1, R2, and R3.  Though as an optimization, if the pseudonode P is attached to R2, then R2 could use P when ingressing either unicast or multicast, so multicast that happens to hash to R2 will not be perceived as an endnode move, whereas multicast hashing to R1 or R3 would.
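
As a rough illustration of the rule described in the paragraph above, the choice of ingress nickname can be sketched as follows. This is only a sketch under stated assumptions, not text from any TRILL specification: the class `ActiveActiveRB`, the function `ingress_nickname`, and the literal nickname string "P" are hypothetical names introduced here.

```python
from dataclasses import dataclass

PSEUDONODE = "P"  # hypothetical label for the pseudonode nickname


@dataclass(frozen=True)
class ActiveActiveRB:
    nickname: str           # the RBridge's own nickname, e.g. "R1"
    hosts_pseudonode: bool  # True for the RB that pseudonode P is attached to


def ingress_nickname(rb: ActiveActiveRB, is_multicast: bool) -> str:
    """Pick the ingress nickname for a frame received from endnode E."""
    if not is_multicast:
        # All unicast from E uses the pseudonode nickname, whichever RB ingresses it.
        return PSEUDONODE
    if rb.hosts_pseudonode:
        # Optimization: the RB that P is attached to may also use P for multicast.
        return PSEUDONODE
    # Otherwise multicast is ingressed with the RB's own nickname.
    return rb.nickname


r1 = ActiveActiveRB("R1", hosts_pseudonode=False)
r2 = ActiveActiveRB("R2", hosts_pseudonode=True)
print(ingress_nickname(r1, is_multicast=False))  # P
print(ingress_nickname(r1, is_multicast=True))   # R1
print(ingress_nickname(r2, is_multicast=True))   # P
```

Under this reading, a distant RB such as R8 sees E behind a different nickname only when multicast is ingressed by an RB other than the one P is attached to.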


[Answer-2] Are you saying E will not send multicast and unicast simultaneously? Are you also suggesting that there is either a single multicast stream, or that all multicast streams hash to the same RBridge that E is attached to?

Tissa said about why endnode moves are a problem:

>>[Answer] Yes, it is, big time. Unlike ancient devices, modern devices do hardware-based learning. When address associations start moving, the hardware constantly bombards the CPU with new-address or address-move notifications.

Your description sounds to me like it's only a problem if there are a lot of moves. It seems unlikely that E will keep alternating between sending multicast and unicast, and even if one node were doing that for some reason, it seems unlikely that lots of nodes would act this way. So the number of moves seen, relative to the total amount of traffic received by R8, should be really small.

[Answer-2] It is not a question of lots of switches; it is a question of how many ping-pongs there are. With a single address and two streams, you can overload the learning buffers when traffic is coming in at high rates.
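
To make the disagreement above concrete, the number of address moves a distant RB would register for a single source address can be counted with a small sketch. The function `count_moves` and the two frame sequences are hypothetical; the sketch only assumes that a move is registered whenever a frame arrives with an ingress nickname different from the one last learned for that address.

```python
def count_moves(ingress_nicknames):
    """Count how often one source address is re-learned behind a new nickname."""
    moves = 0
    learned = None
    for nickname in ingress_nicknames:
        if learned is not None and nickname != learned:
            moves += 1  # the address appears to have moved
        learned = nickname
    return moves


# Unicast (ingress nickname "P") interleaved frame by frame with a
# multicast stream ingressed by R1 (ingress nickname "R1").
interleaved = ["P", "R1"] * 4

# The same eight frames, but E sends all unicast first, then all multicast.
bursty = ["P"] * 4 + ["R1"] * 4

print(count_moves(interleaved))  # 7
print(count_moves(bursty))       # 1
```

The move count depends on how often the two streams alternate, not on how many endnodes behave this way: tightly interleaved streams produce a move on nearly every frame, while an endnode that rarely switches produces very few.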

So, the disadvantages of this proposal, as Sunny pointed out, are
1) the active-active RBs might need a hardware change in order to selectively use a different ingress nickname depending on whether the packet is multicast or unicast  (which implementations is this true for?)
2) distant RBs (say R8), will perceive E moving when E switches from sending unicast to multicast.

[Answer-2] #2 is exactly what I am describing as a big problem. But if you feel it is not a problem, I am not going to argue any further. You should speak to a few more people and get validation. ***Address move is a major problem***

But as I said, all approaches have pluses and minuses, so the minuses have to be weighed against the pluses.

The advantages of this proposed approach relative to the Affinity TLV are:

a) you only need one tree for the campus, and you can have arbitrarily many active-active nodes.  Or to put it a different way, you can have more active-active RBs than there are trees.

b) furthermore, it allows the active-active RBs to do load splitting of multicast (choosing more than one tree), without requiring n*k trees, where n is the number of active-active RBs, and k is the number of trees each might want to use for load splitting.

c) you don't need all RBs to change at once in the campus (as you would with the Affinity proposal).  If R8 is unmodified, R8 will see E changing when E changes from multicast to unicast, but will not have incorrect behavior.  With affinity, absolutely every RB in the campus must be modified or there will be incorrect (not just suboptimal) behavior.
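
The tree-count comparison in points (a) and (b) can be illustrated with a small calculation. It assumes, as one reading of point (b), that the Affinity approach needs n*k trees for n active-active RBs each load-splitting over k trees, while the proposed approach needs only the k trees themselves; the function names are hypothetical.

```python
def trees_with_affinity(n: int, k: int) -> int:
    """Trees needed if each of n active-active RBs needs its own k trees."""
    return n * k


def trees_with_proposal(k: int) -> int:
    """Trees needed if all active-active RBs can share the same k trees."""
    return k


# Example: 3 active-active RBs, each wanting to split multicast over 2 trees.
print(trees_with_affinity(n=3, k=2))  # 6
print(trees_with_proposal(k=2))       # 2
```

With k = 1 this reduces to point (a): a single tree for the campus, regardless of how many active-active RBs there are.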


Radia