Re: [mpls] [PWE3] WG Last Call for draft-ietf-pwe3-endpoint-fast-protection-01 - RFC4447

Mingui Zhang <zhangmingui@huawei.com> Thu, 07 August 2014 03:57 UTC

From: Mingui Zhang <zhangmingui@huawei.com>
To: "erosen@cisco.com" <erosen@cisco.com>, Alexander Vainshtein <Alexander.Vainshtein@ecitele.com>
Thread-Topic: [PWE3] [mpls] WG Last Call for draft-ietf-pwe3-endpoint-fast-protection-01 - RFC4447
Thread-Index: AQHPsY7DDiJw7y3ZEk+Q1kSqiialjpvEXftQ
Date: Thu, 07 Aug 2014 03:56:57 +0000
Message-ID: <4552F0907735844E9204A62BBDD325E76AAAA7F4@nkgeml512-mbx.china.huawei.com>
References: Your message of Tue, 05 Aug 2014 13:59:49 -0000. <9696d0db139d46ffaad7be11340215e8@AM3PR03MB612.eurprd03.prod.outlook.com> <16167.1407340459@erosen-lnx>
In-Reply-To: <16167.1407340459@erosen-lnx>
Archived-At: http://mailarchive.ietf.org/arch/msg/mpls/rK2ZHoSdMWkCNivUie0Da-RJuMM
Cc: "mpls@ietf.org" <mpls@ietf.org>, pwe3 <pwe3@ietf.org>, "pwe3-chairs@tools.ietf.org" <pwe3-chairs@tools.ietf.org>
Subject: Re: [mpls] [PWE3] WG Last Call for draft-ietf-pwe3-endpoint-fast-protection-01 - RFC4447

Hi Eric, 

Your explanation is very clear. Let me add a few points.

>- A "primary egress PE" partitions its set of PWs into n sets, where each
>  set is associated with a given "protector".  The "protector" is the backup
>  egress PE for those PWs.

When the backup PE acts as the protector, that is the 'co-located' model. According to the text in the draft, the protector can also be any other PE or P router; that is the 'centralized' model, in which operators are allowed to designate the protector. The authors can correct me if I am wrong.

>However, when protecting against the failure of the egress AC, packets would
>go first to PE2 then to PE3.  This scenario is a bit different, in that:
>
>- I think this scenario may require an RSVP-TE backup tunnel (on which PHP
>  is not used) from PE2 to PE3.
>
>- PE2 would have to signal PE3 to associate the given PW with the given
>  backup tunnel.
>
>- The label that PE3 assigns to that backup tunnel would identify the
>  context in which the PW label is looked up.
>
>This is perhaps more detail than is actually in the draft ;-) but if it is
>an accurate description of the authors' intentions, the scheme seems to
>work.

I think the failure of the egress AC can't be handled in the same way, because the failure does not make the anycast IP address unreachable. This address is usually configured as a loopback address of the PE, so it stays reachable even when the physical interfaces are down. So I understand why you advise handling AC failures separately, using the RSVP-TE backup tunnel.

I think we still have a chance to incorporate it into a single LDP-based scheme:

We can replace the anycast IP address with the IP address of a virtual router. A link between the primary PE and the virtual router will then appear in routing. When the primary PE detects the AC failure, it acts as the PLR: it simply treats the AC failure as a failure of that link and starts to use the backup tunnel that was prepared in advance for the link failure.
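
To illustrate the idea, here is a rough Python sketch of how the primary PE might behave as the PLR. It is only a toy model under my own assumptions: the class name, the tunnel name "tunnel-PE2-PE3" and the AC names are hypothetical, and nothing here is taken from the draft.

```python
class PrimaryPE:
    """Toy model of a primary PE that acts as PLR for its own ACs."""

    def __init__(self, backup_tunnel):
        # Backup tunnel prepared in advance for the failure of the
        # (assumed) link between this PE and the virtual router.
        self.backup_tunnel = backup_tunnel
        self.failed_acs = set()

    def on_ac_failure(self, ac):
        # The AC failure is treated like a failure of the virtual link.
        self.failed_acs.add(ac)

    def forward(self, ac):
        # Healthy AC: deliver locally. Failed AC: use the backup tunnel.
        if ac in self.failed_acs:
            return self.backup_tunnel
        return ac


pe2 = PrimaryPE(backup_tunnel="tunnel-PE2-PE3")
before = pe2.forward("AC1")
pe2.on_ac_failure("AC1")
after = pe2.forward("AC1")
print(before, "->", after)
```

Traffic for ACs that are still up keeps going out locally; only traffic for the failed AC is diverted onto the pre-established tunnel.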

One concern is whether these backup tunnels raise a scalability issue. In other words, do we have to set up a backup tunnel per AC? The answer is no. A set of ACs can share the same backup tunnel if their corresponding CEs are multi-homed to the same <primary PE, backup PE> pair.
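
A small Python sketch of this grouping, assuming a hypothetical inventory that maps each AC to the <primary PE, backup PE> pair its CE is multi-homed to (the AC and PE names are made up for illustration):

```python
from collections import defaultdict

# Hypothetical inventory: AC -> (primary PE, backup PE) of its CE.
ac_homing = {
    "AC1": ("PE2", "PE3"),
    "AC2": ("PE2", "PE3"),
    "AC3": ("PE2", "PE5"),
}


def backup_tunnels(homing):
    """Group ACs so that each <primary PE, backup PE> pair gets one tunnel."""
    tunnels = defaultdict(list)
    for ac, pe_pair in sorted(homing.items()):
        tunnels[pe_pair].append(ac)
    return dict(tunnels)


tunnels = backup_tunnels(ac_homing)
for (primary, backup), acs in tunnels.items():
    print(f"{primary}->{backup}: {acs}")
```

Three ACs need only two backup tunnels here, because AC1 and AC2 share the same PE pair. The number of tunnels grows with the number of PE pairs, not with the number of ACs.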

Thanks,
Mingui

>-----Original Message-----
>From: pwe3 [mailto:pwe3-bounces@ietf.org] On Behalf Of Eric Rosen
>Sent: Wednesday, August 06, 2014 11:54 PM
>To: Alexander Vainshtein
>Cc: mpls@ietf.org; pwe3; pwe3-chairs@tools.ietf.org; stbryant@cisco.com
>Subject: Re: [PWE3] [mpls] WG Last Call for
>draft-ietf-pwe3-endpoint-fast-protection-01 - RFC4447
>
>I've looked at this draft, and it seems to me that it does work in LDP-based
>networks (even with DU mode), and that it does not violate any architectural
>principles.  However, I have made certain assumptions that aren't explicitly
>stated in the draft.  (I am assuming that the draft is intended to be
>patterned after the "anycast BGP" scheme that I've heard about, but
>that is not written up anywhere as far as I know.)  Let me say how I think
>it is intended to work, and the authors can correct me if I'm wrong.
>
>For protection against the failure of the "primary" egress PE:
>
>- A "primary egress PE" partitions its set of PWs into n sets, where each
>  set is associated with a given "protector".  The "protector" is the backup
>  egress PE for those PWs.
>
>- Each such set is associated with an IP address.  This IP address must be
>  known both to the primary PE and to the protector.
>
>  * One can think of this address as an "anycast" address that, to routing,
>    denotes both the primary PE and the protector.  At the primary PE and
>    the protector, the address denotes a particular set of PWs.
>
>  * These anycast addresses are distributed by routing, and good old
>    fashioned MPLS labels get assigned to them by LDP.  The primary PE MAY
>    bind implicit null to that address, but the protector MUST NOT bind
>    implicit null to that address.
>
>  * Routing is set up so that (a) if the primary PE is up, traffic to the
>    anycast address always goes to the primary PE, but (b) if the primary PE
>    is down, traffic to the anycast address goes to the protector instead.
>    This can be achieved by messing with the metrics, e.g.
>
>- For a given PW, the primary PE informs both the protector and the ingress
>  PE of (a) the label that it (the primary PE) has assigned to that PW, and
>  (b) the "anycast address" that corresponds to that PW.
>
>  This requires some new signaling (beyond that specified in RFC 4447), but
>  the requisite signaling is specified in the draft.
>
>- When sending a packet on the given PW, the ingress PE pushes on the label
>  assigned to the PW by the egress PE.  Then the ingress PE pushes on
>  whatever LDP-assigned label it needs to use to get the packet to the
>  anycast address that is associated with the PW.
>
>- When the protector learns that the egress PE has assigned a given label
>  and anycast address to the PW, it stores that label in a context-specific
>  label space associated with that anycast address.  Whenever the protector
>  receives a packet whose top label is the label that the protector has
>  bound to the given anycast address, the protector looks up the next label
>  in the corresponding label space.
>
>Let's look at an example:
>
>CE1---PE1----PLR----PE2----CE2
>               |             |
>               |-----PE3-----|
>
>where PE3 is the protector for a PW connecting CE1 to CE2 (via PE1 and PE2
>respectively).   Call this pseudowire PW1.
>
>Thus there is some IP address, call it A, that PE2 and PE3 both advertise to
>routing.  The metrics are set up so that as long as PE2 is up, it appears to
>be a better path to A than does PE3.
>
>So PE2 assigns label L1 to PW1, and uses T-LDP to inform both PE1 and PE3 of
>this mapping.  To PE1, these are just outgoing labels.  To PE3, however,
>they are incoming labels in a context-specific label space.
>
>PE2 and PE3 both assign labels to A (let's call these labels L2 and L3
>respectively), and distribute the corresponding label bindings to PLR.  PLR
>also assigns a label to A, and distributes that label binding (L4) to PE1.
>
>When PE1 has a packet to send on PW1, it pushes L1, then pushes L4, then
>sends the packet to PLR.  PLR swaps L4 with L2 and sends the packet to PE2.
>(Note that L2 could be implicit NULL.)  PE2 sees label L1, and sends the
>packet to CE2.
>
>Now if PE2 goes down, things happen a bit differently.  When PLR gets a
>packet with L4 on top, it swaps L4 with L3, and sends the packet to PE3.  (In
>this case, of course, L3 cannot be implicit NULL.)
>
>PE3 gets the packet, sees L3 on top, looks it up, and then says "L3 is a
>context label, so I have to pop it and then look up the following label in
>the L3-specific label table."  This context-specific label table contains
>L1, which has been bound to the PW that leads to CE2.  So PE3 pops the label
>and sends the packet to CE2.
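
The label operations in this example can be traced with a short Python sketch. This is only a toy model: the labels L1 to L4 and the node names come from the example above, while the table layout and function names are my own assumptions, not anything specified in the draft.

```python
# PE2 looks up the PW label L1 in its own platform label space.
PE2_PLATFORM = {"L1": "CE2"}
# PE3 keeps L1 in a context-specific table selected by context label L3.
PE3_CONTEXT = {"L3": {"L1": "CE2"}}


def pe1_push(payload):
    # Label stack, top first: L4 (toward address A), then PW label L1.
    return ["L4", "L1", payload]


def plr_swap(packet, pe2_up):
    # PLR swaps L4 for L2 (toward PE2) or L3 (toward PE3 when PE2 is down).
    assert packet[0] == "L4"
    if pe2_up:
        return "PE2", ["L2"] + packet[1:]
    return "PE3", ["L3"] + packet[1:]


def egress_lookup(node, packet):
    top, pw_label, payload = packet
    if node == "PE2":
        return PE2_PLATFORM[pw_label], payload
    # PE3: the top label selects the context table for the PW label.
    return PE3_CONTEXT[top][pw_label], payload


def deliver(payload, pe2_up):
    node, packet = plr_swap(pe1_push(payload), pe2_up)
    ce, data = egress_lookup(node, packet)
    return node, ce, data


print(deliver("frame", pe2_up=True))
print(deliver("frame", pe2_up=False))
```

In both cases the frame reaches CE2; the only difference is which egress PE resolves L1 and in which label space.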
>
>I think that's basically it, and the necessary mechanisms do seem to be
>described in the draft.  The draft doesn't actually use the notion of
>"anycast address" explicitly, but I think the intention is that the "context
>labels" are definitely bound to anycast addresses.  I hope the authors will
>correct me if I'm wrong.
>
>Although the above example shows a PLR that is a neighbor of both PE2 and
>PE3, this isn't really necessary, one could easily construct an example
>where PE2 and PE3 have no neighbors in common.
>
>The example above doesn't rely on LDP DoD at all, and doesn't make use of
>any RSVP-TE backup tunnels.
>
>However, when protecting against the failure of the egress AC, packets would
>go first to PE2 then to PE3.  This scenario is a bit different, in that:
>
>- I think this scenario may require an RSVP-TE backup tunnel (on which PHP
>  is not used) from PE2 to PE3.
>
>- PE2 would have to signal PE3 to associate the given PW with the given
>  backup tunnel.
>
>- The label that PE3 assigns to that backup tunnel would identify the
>  context in which the PW label is looked up.
>
>This is perhaps more detail than is actually in the draft ;-) but if it is
>an accurate description of the authors' intentions, the scheme seems to
>work.
>
>I don't think this modifies the MPLS architecture at all.  It is true that
>PE3 needs to maintain a context-specific label table that contains
>labels assigned by PE2.  While one may say that that is a "coordinated label
>space", it is no more than what is required by the nature of
>"upstream-assigned labels".  I.e., if you can't support this level of
>coordination, you just can't support upstream-assigned labels.
>
>Also, I don't think this draft is an "update" to RFC 4447.  You don't need
>to read this draft to implement RFC 4447, nor does it modify the procedures
>of RFC 4447.  It just introduces some new optional signaling and procedures.
>I don't think RFC 4447's requirement to use the "platform label space"
>should be interpreted as prohibiting the use of upstream-assigned labels.
>The intention of RFC 4447 was to prohibit the use of the interface-specific
>label space for PW labels.  This prohibition was necessary because the
>ingress PE doesn't know which of the egress PE's network-facing interfaces
>will actually receive a given packet.  This draft doesn't have that problem,
>as it provides clear procedures for determining the context in which the
>upstream-assigned labels are looked up.