Re: [trill] draft-ietf-trill-cmt

"Pat Thaler" <pthaler@broadcom.com> Wed, 21 November 2012 01:07 UTC

From: Pat Thaler <pthaler@broadcom.com>
To: Jon Hudson <jon.hudson@gmail.com>, Donald Eastlake <d3e3e3@gmail.com>
Date: Wed, 21 Nov 2012 01:07:03 +0000
Message-ID: <EB9B93801780FD4CA165E0FBCB3C3E671DF01F29@SJEXCHMB09.corp.ad.broadcom.com>
References: <FBEA3E19AA24F847BA3AE74E2FE19356237B70B7@xmb-rcd-x08.cisco.com> <201211192208.qAJM87O1007297@cichlid.raleigh.ibm.com> <CAF4+nEFqAap=oN=bTxBf3K3Zj7GJy-ZcWP7-efJbftq5sOPjGQ@mail.gmail.com> <D1EB4C66-B1E9-4FAF-AF5C-14D9D8C5C6FC@gmail.com>
In-Reply-To: <D1EB4C66-B1E9-4FAF-AF5C-14D9D8C5C6FC@gmail.com>
Cc: Thomas Narten <narten@us.ibm.com>, "Tissa Senevirathne (tsenevir)" <tsenevir@cisco.com>, "trill@ietf.org" <trill@ietf.org>
Subject: Re: [trill] draft-ietf-trill-cmt

Jon and Donald

I'm skeptical about a "building block" to use in future solutions in general, and especially for link aggregation or teaming. It's fairly difficult to make a group of bridges generically pretend to be a single bridge, and one should start with an architecture for the solution, IMO, rather than starting with pieces that you hope to pull into a future solution.

"A typical deployment scenario, depicted in Figure
   1, which may have either End Stations and/or Legacy bridges attached
   to the RBridges.  These Legacy devices typically are multi-homed to
   several RBridges and treat all of the uplinks as a single Link
   Aggregation (LAG) bundle [802.1AX]."
End stations support many forms of teaming, so they don't "typically" use any one form. For instance, one is a bridge-independent form of active-active teaming where the end station load balances traffic across the various links, using a different MAC address on each link. End stations generally can't treat links to separate bridges as a single 802.1AX LAG unless the bridges cooperate to support a proprietary multi-chassis LAG.
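
To illustrate the bridge-independent form of teaming described above, here is a minimal Python sketch. It is an illustration under stated assumptions, not taken from any particular teaming driver: the TeamLink type, the pick_link function, the interface names, the MAC addresses, and the choice of CRC32 as the hash are all hypothetical. The point is only that the end station pins each flow to one link and sends on that link with the link's own source MAC, so the attached bridges need no knowledge of the team.

```python
import zlib
from dataclasses import dataclass


@dataclass(frozen=True)
class TeamLink:
    """One member link of a hypothetical end-station team."""
    name: str      # hypothetical interface name
    src_mac: str   # source MAC used only on this link


def pick_link(links, flow_key):
    """Map a flow identifier to one team member.

    The same flow_key always selects the same link, so frames of a
    flow stay in order and each bridge sees a stable source MAC.
    """
    if not links:
        raise ValueError("team has no links")
    index = zlib.crc32(flow_key.encode("utf-8")) % len(links)
    return links[index]


# Hypothetical two-link team, each link attached to a different bridge.
team = [
    TeamLink("eth0", "02:00:00:00:00:01"),
    TeamLink("eth1", "02:00:00:00:00:02"),
]

# Hypothetical flow identifier built from addresses and ports.
chosen = pick_link(team, "10.0.0.5:49152->10.0.1.9:80")
print(chosen.name, chosen.src_mac)
```

Because each link carries its own MAC address, neither bridge has to present itself as part of a single aggregated system, which is what distinguishes this from an 802.1AX LAG.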

LAG is defined by 802.1AX, and that term shouldn't be used for other mechanisms for using multiple links. It's better to use something else, like "teaming", when one wants to include both 802.1AX and non-802.1AX mechanisms. The groupings identified as LAGs in Figure 1 will not be LAGs unless the bridges present themselves as a single bridge to the end stations, which isn't trivial to get right and is currently proprietary.

DRNI (Distributed Resilient Network Interconnect), the capability being added in the 802.1AX-Rev project, has a focused objective: defining a mechanism for using a LAG between two provider networks (that's why it's called a "network interconnect") where the links end in more than one bridge (probably two or three bridges) on each side of the aggregation. Attaching end nodes (hosts) to a multi-bridge LAG isn't an objective.

Pat

-----Original Message-----
From: Jon Hudson [mailto:jon.hudson@gmail.com] 
Sent: Monday, November 19, 2012 10:24 PM
To: Donald Eastlake
Cc: Thomas Narten; Tissa Senevirathne (tsenevir); trill@ietf.org
Subject: Re: [trill] draft-ietf-trill-cmt

Just to add to what Donald said...

This is a case where the draft came out of a need that kept appearing while discussing and trying to solve other challenges (like active-active host-based connections).

As opposed to a "draft in search of a problem", this is really a case of a draft providing a building block whose absence would prevent other solutions from coming forward.

Now as to the terminology. There is much confusion in the market and with customers as to when something is or is not a LAG.

One of the limitations today of host-based LAGs is that they have to go to the same physical switch unless the target switch is part of a proprietary MCT pair, and the MCT group is itself only a pair.

What many customers want (and what other proprietary solutions provide today) is N active links to N top-of-rack switches, where the two most common initial configs would be two links from one host to two TOR switches, or four active links from a host to four different TOR switches.

This may be a cause of the confusion over the document saying N switches. It is also part of the value since, as you stated, most MLAG solutions are limited to a pair and are subpar as a result.

J

On Nov 19, 2012, at 8:58 PM, Donald Eastlake <d3e3e3@gmail.com> wrote:

> Hi Thomas,
> 
> Let me describe what I think is going on.
> 
> On Mon, Nov 19, 2012 at 5:08 PM, Thomas Narten <narten@us.ibm.com> wrote:
>> Hi.
>> 
>> I've got a fairly basic question about this document.
>> 
>> What exact customer/deployment scenario is it trying to address?
> 
> Scenarios in which a customer wants active-active end station connectivity.
> 
> TRILL can provide ECMP and multi-pathing and rapid fail-over in the
> links between RBridges. This draft is aimed at specifying a building
> block (the Affinity sub-TLV) and giving an example of use of that
> building block towards getting the same capabilities for end station
> connectivity to the campus.
> 
>> It says:
>> 
>>>   This document specifies a concept of Affinity sub-TLV to solve
>>>   associated RPF issues at the active-active edge. Specific methods in
>>>   this document for making use of the Affinity sub-TLV are applicable
>>>   where multiple RBridges are connected to an edge device through link
>>>   aggregation or to a multiport server or some similar arrangement
>>>   where the RBridges cannot see each other's Hellos.
>> 
>> I view Link Aggregation Groups (LAGs) as an L2 construct that would be
>> hidden from TRILL. I.e., in Ethernet, LAG is used to "hide" the
>> presence of multiple links between a pair of switches from STP, so STP
>> only sees one link (the LAG) rather than the individual links. Thus, I
>> would expect RBs to want to only know of a single link (i.e., the LAG)
>> and not have any understanding that the link may actually be a LAG.
> 
> There are various references in various TRILL documents that agree
> with you that the classic two end-point LAG appears to TRILL as a
> single link with a single port at each end. This draft gives, as
> examples of the use of the Affinity sub-TLV, multi-port servers which
> do not forward frames between those ports, and what is usually called
> multi-chassis LAG (MC-LAG) where you have one device (such as a
> bridge) on one end and multiple devices on the other end that conspire
> to act like one device. (MC-LAG is commonly supported but using
> proprietary methods. There is a project in IEEE 802.1, 802.1AX-REV,
> that is seeking to extend LAG to allow more than one device on each
> side.)
> 
> draft-hu-trill-pseudonode-nickname is an example of another use of the
> Affinity sub-TLV not related to LAG. If I recall correctly, at the
> Atlanta meeting I repeatedly paired the trill-pseudonode-nickname
> draft with the trill-cmt draft and suggested people consider them
> together.
> 
>> Moreover, Figure 1 is not describing LAG, but multi-chassis LAG
>> (MLAG). MLAG is not standardized (all implementations are proprietary
>> and there is no interoperability between vendors). Moreover, MLAG
>> implementations are pretty restricted -- last I checked, MLAG products
>> were limited to only 2 switches. So the figure in the document showing
>> MLAG involving an arbitrary number of "N" switches seems a stretch.
> 
> I'm not sure the number of RBridges in the edge group matters so much
> or that you know about all MC-LAG implementations. In any case, a
> multi-port server could have more than two ports.
> 
>> Do I have the right understanding of what is meant by "LAG" in the
>> draft?
> 
> I think so -- that is, you correctly understood that in this draft it
> actually means MC-LAG.
> 
>> The idea that RBs would be aware of individual links within a LAG
>> kind of defeats the purpose of having LAGs. Is that really what is
>> being proposed?
> 
> The draft is intended to specify the Affinity sub-TLV and to provide
> an example where an edge group of RBridges cannot see each other's
> Hellos. For the sub-part of the example using MC-LAG, rather than a
> multi-port server, I would say the MC-LAG is doing exactly what the
> purpose of MC-LAG is.
> 
> Thanks,
> Donald
> =============================
> Donald E. Eastlake 3rd   +1-508-333-2270 (cell)
> 155 Beaver Street, Milford, MA 01757 USA
> d3e3e3@gmail.com
> 
>> Thomas