Re: [trill] TRILL Resilient Distribution Trees, who have read it?

Mingui Zhang <zhangmingui@huawei.com> Fri, 07 December 2012 02:19 UTC

From: Mingui Zhang <zhangmingui@huawei.com>
To: gayle noble <windy_1@skyhighway.com>, "trill@ietf.org" <trill@ietf.org>
Date: Fri, 07 Dec 2012 02:19:37 +0000
Subject: Re: [trill] TRILL Resilient Distribution Trees, who have read it?

Hi Gayle,

Thanks for your review! I will update the draft soon and incorporate the changes you've suggested.
Responses to your questions follow.

>Question: if this computation is done independently what do they mean by "the same result is got across the whole campus" ??

Same in the sense of [consistent]: the computation is deterministic and every RBridge runs it on the same link-state database, so each RBridge independently arrives at the same result.

>question: Similar as [CMT] .. what does this mean?

In the CMT draft, the nickname of the originator of the Affinity TLV is not explicitly included. When an RBridge receives this TLV, it knows the originator from the LSP that carries it, so it can pinpoint the Affinity Link (from the originator to the RB specified in the AFFINITY RECORD). This also means that only the upstream-attached node can announce an Affinity Link.
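
To make this concrete, here is a minimal Python sketch of how a receiver derives the link (the record layout is a simplified stand-in, not the exact CMT encoding):

    from dataclasses import dataclass

    @dataclass
    class AffinityRecord:
        child_nickname: int   # the RB named in the AFFINITY RECORD
        num_of_trees: int     # tree roots this Affinity Link applies to

    def affinity_links(lsp_originator: int, records: list) -> list:
        # The TLV omits the parent nickname; receivers infer it from the
        # originator of the LSP carrying the TLV, which is why only the
        # upstream node can announce the Affinity Link.
        return [(lsp_originator, r.child_nickname) for r in records]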

>tag for RPFC relaxing

It would be great if we could introduce a tag here, and I'd like to take it one step further: I suggest using one bit from the Multi-Topology field as the tag. With this tag, RPFC relaxing becomes unnecessary. I remember someone also asked in an offline discussion whether we could make use of the MT technique. Unfortunately, TRILL does not support the MT feature yet.
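
Roughly what I have in mind, as an illustrative Python sketch (the bit position and frame fields are hypothetical, since TRILL does not define this today):

    BACKUP_DT_BIT = 0x1  # hypothetical: one bit borrowed from the MT field

    def rpf_accept(frame: dict, primary_rpf_port: int,
                   backup_rpf_port: int) -> bool:
        # A tagged frame is checked strictly against the backup tree's RPF
        # port, an untagged one against the primary's; no relaxation needed.
        tagged = bool(frame["mt_field"] & BACKUP_DT_BIT)
        expected = backup_rpf_port if tagged else primary_rpf_port
        return frame["ingress_port"] == expected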

RPFC relaxing is expensive. That is why I once suggested at a TRILL face-to-face meeting that the local protection content be kept in the draft for clarification. The rationale is this: most people from vendors and operators prefer local protection to global protection, and they would be confused if no local protection content were included.

If the MT feature is supported by TRILL in the future, we can treat the local protection content as a placeholder: it can easily be updated to use the MT technique in place of RPFC relaxing.

Thanks,
Mingui



________________________________
From: trill-bounces@ietf.org [trill-bounces@ietf.org] on behalf of gayle noble [windy_1@skyhighway.com]
Sent: Friday, December 07, 2012 0:54
To: trill@ietf.org
Subject: Re: [trill] TRILL Resilient Distribution Trees, who have read it?


(in my opinion) corrections:
1. Introduction
Normally, network [Normally, a network] fault will be recovered through a network wide reconvergence of the forwarding states but [states, but] this process is too slow to meet the tight SLA requirements on the service disruption duration.

in the next paragraph:
With backup forwarding states installed in advance, a protection mechanism is possible to restore a interrupted [an interrupted] multicast stream in tens of milliseconds which guarantees the stringent SLA on service disruption

However, the way TRILL constructs distribution trees (DT) is different from the way multicast trees are computed under IP/MPLS therefore [IP/MPLS, therefore] a multicast protection mechanism suitable for TRILL is required.

Global 1:1 protection is used to refer to the mechanism having the multicast source RBridge normally injects [inject] one multicast stream onto the primary DT.

When this stream is detected to be interrupted, the source RBridge switches to the backup DT to inject subsequent multicast stream [streams] until the primary DT is recovered.

Global 1+1 protection is used to refer to the mechanism having the multicast source RBridge always injects [inject] two copies of multicast streams onto the primary DT and backup DT respectively.

In [the] normal case, multicast receivers pick the stream sent along the primary DT and egress it to its local link.


2.2
When RBridges receive an Affinity Link which is an incoming link of RB2. RB2's [RB2, RB2's] incoming links other than the Affinity Link are removed from the full graph of the campus to get a sub graph.
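
A minimal Python sketch of the sub-graph step quoted above, assuming the campus is modeled as a set of directed (parent, child) links (names are illustrative, not from the draft):

    def subgraph_for_affinity(links: set, affinity: tuple) -> set:
        # Keep the Affinity Link itself; drop every other incoming link of
        # its child before computing the tree on the resulting sub-graph.
        parent, child = affinity
        return {(u, v) for (u, v) in links
                if v != child or (u, v) == affinity}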

(point four)
  In order to protect a node on the primary tree, a backup tree can be
  set up as lack of this [set up without this] node [mMRT].

A DT does [DT that does] not span all RBridges in the campus may not cover all receivers of many a multicast [many multicast] group (This is different from the multicast trees construction signaled by PIM [RFC4601] or mLDP [RFC6388].).
3.2.1
For example, in Figure 3.1, the backup DT is set up maximally disjoint to the primary DT (The full topology is an combination [a combination] of these two DTs, which is not shown in the figure.).

Except the [Except for the] link between RB1 and RB2, all other links on the primary DT do not overlap with links on the backup DT.

It means that every link on the primary DT except link RB1-RB2 can [DT, except link RB1-RB2, can] be protected by the backup DT.

3.2.1.1
But it is desirable that each RBridge independently computes Affinity Links for a backup DT while the same result is got across the whole campus, which enables a distributed deployment and also minimizes configuration . [configuration.]

Question: if this computation is done independently what do they mean by "the same result is got across the whole campus" ??


Algorithms for MRT [mMRT] may be used to figure out Affinity Links on a backup DT which is maximally disjoint [disjointed] to the primary DT but it only provides a subset of all possible solutions.

Two disjoint [disjointed] (or maximally disjoint [disjointed]) trees may root from different nodes, which significantly augments the solution space.

In order to reduce the amount of Affinity TLVs flooded across the campus, only those will not [those not] picked by conventional DT calculation process ought to be recognized as Affinity Links.

3.2.1.2
Similar as [CMT], every Parent RBridge (PRB) of an Affinity Link take [takes] charge of announcing this link in the Affinity TLV.

question: Similar as [CMT] .. what does this mean?

3.2.2. Backup DT Calculation without Affinity Links

This section aims to provide an alternative method to set up the disjoint [disjointed] DT without Affinity Links.

In other words, the two trees will be maximally disjoint [disjointed].

To sum up, RBridges do precompute [pre-compute] all the trees that might be used but [used, but] only install part of them according to each ingress.
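
A toy Python sketch of the "pre-compute all, install per ingress" idea quoted above (compute_tree and the ingress-to-root mapping are stand-ins, not draft-defined APIs):

    def compute_tree(root: str) -> dict:
        return {"root": root}  # placeholder for the real tree computation

    candidate_roots = ["RB1", "RB2", "RB3"]         # every tree that might be used
    root_by_ingress = {"RB7": "RB1", "RB8": "RB2"}  # hypothetical choices

    precomputed = {r: compute_tree(r) for r in candidate_roots}
    installed = {ing: precomputed[r]                # only the per-ingress subset
                 for ing, r in root_by_ingress.items()}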

Since the backup DT is intentionally built up maximally disjoint [disjointed] to the primary DT, when a link fails and interrupts the ongoing multicast traffic sent along the primary DT, it is probably that the backup DT is not affected.


4.1. Pruning the Backup Distribution Tree
  Backup [The backup] DT should be pruned per-VLAN.
But the way backup [way a backup] DT being [is] pruned is different from the way that the primary DT is pruned.

Even though a branch contains no downstream receivers, it is probably [probable] that it should not be pruned for the purpose of protection.

Those redundant links [that] ought to be pruned will not be protected.

4.2
However, when global 1+1 protection or local protection is applied, traffic duplication will happen if multicast receivers accept both copies of multicast [of the multicast] frame from two RPF filters.

In order to avoid such duplication, multicast receivers (egress RBridge) MUST act as merge points to active [activate] a single RPF filter and discard the duplicated frames from the other RPF filter.
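
A minimal Python sketch of the merge-point rule quoted above (the class and field names are illustrative):

    class MergePoint:
        def __init__(self) -> None:
            self.active_tree = "primary"  # flipped when a failure is detected

        def accept(self, frame_tree: str) -> bool:
            # Egress only the copy passing the active RPF filter; the
            # duplicate arriving via the other tree is discarded.
            return frame_tree == self.active_tree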

5.1
Upon [When] the ingress RBridge is notified about the failure, it immediately makes this switch over.

[The] Ingress RBridge will switch ongoing multicast traffic based on this judgment.

For example, if RB9 does not response while RB10 still responses [responds], RB7 will presume that link RB1-RB5 and RB5-RB9 are failed.

Accurate link failure detection might help ingress RBridge [RBridges] to make smarter decision but it's out of the scope of this document.
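
The RB9/RB10 inference above, as a toy Python sketch (the path table is hypothetical topology data, not taken from the draft):

    paths = {"RB9": ["RB1-RB5", "RB5-RB9"],      # hypothetical paths from RB7
             "RB10": ["RB1-RB6", "RB6-RB10"]}
    responding = {"RB9": False, "RB10": True}

    # Links on the path to a silent RBridge are presumed failed.
    suspected = {link for rb, ok in responding.items() if not ok
                 for link in paths[rb]}
    # suspected == {"RB1-RB5", "RB5-RB9"}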

RBridges may make use of [the] RBridge Channel to speed up the failure propagation [RBch]. LSPs for the purpose of failure notification may be sent to the ingress RBridge as unicast TRILL Data using [the] RBridge Channel.

5.2.2. Traffic Forking and Merging

For the sake of protection, transit RBridges SHOULD active [activate] both primary and backup RPF filters, therefore both copies of the multicast frames will pass through transit RBridges.

 5.3.
In the local protection, the Point of Local Repair (PLR) happens at the upstream RBridge connecting the failed link who [link. It is this RBridge that] makes the decision to replicate the multicast traffic to recover this link failure.

Since the ingress RBridge is not necessarily the root of the distribution tree in TRILL, a multicast downstream point may be not [may not be] the descendants of the ingress point on the distribution tree.

5.3.1
But the ingress of the multicast frame MUST be remained [MUST remain] unchanged.
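
A short Python sketch of this constraint at the PLR (the frame fields are assumed, not a draft-defined encoding):

    def redirect_to_backup(frame: dict, backup_tree: int) -> dict:
        repaired = dict(frame)
        repaired["tree"] = backup_tree   # replicate onto the backup DT
        # "ingress" is deliberately left untouched, per the MUST above.
        return repaired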


5.3.2. Duplication Suppression

When a PLR starts to sent [send] replicated multicast frames on the backup DT, multicast frames sent along the primary DT are still going on.

5.4
However, if the PLR stops redirecting earlier than the ingress RBridge switches to the new primary DT, packet loss may happen; [.]



[my concern: Relaxing the Reverse Path Forwarding check could cause problems unless some additional magic is added. Could a tag be added stating that a path is down? Example: RB1-RB5 down. Then the receiving RBridge knows it can relax its Reverse Path Forwarding check for that path.]